Reference to a Related Application



The present application is a continuation-in-part of our copending U.S. Patent Application Serial No.
07/866,045, filed on April 9, 1992, which is incorporated by reference in its entirety.


Background of the Invention 

The present invention concerns non-A, non-B hepatitis (hereinafter called NANB hepatitis) virus 
genome, polynucleotides, polypeptides, related antigen, antibody and detection systems for detecting 
NANB antigens or antibodies.

The forms of viral hepatitis for which the DNA or RNA of the causative virus has been elucidated, and
for which diagnosis and, in some cases, prevention have been established, are hepatitis A and hepatitis B.
The general name NANB hepatitis was given to the other forms of viral hepatitis.

Post-transfusion hepatitis was remarkably reduced after introduction of diagnostic systems for screening
hepatitis B in transfusion bloods. However, there are still an estimated 280,000 annual cases of
post-transfusion hepatitis caused by NANB hepatitis in Japan.

NANB hepatitis viruses were recently named C, D and E according to their types, and scientists started
a worldwide effort to search for the causative viruses and subsequently to exterminate them.

In 1988, Chiron Corp. claimed that they had succeeded in cloning the genome of an RNA virus, which they
termed hepatitis C virus (hereinafter called HCV), as the causative agent of NANB hepatitis, and reported
its nucleotide sequence (British Patent 2,212,511, which is the equivalent of European Patent Application
0,318,216). HCV (C100-3) antibody detection systems based on the sequence are now being introduced for
screening of transfusion bloods and for diagnosis of patients in Japan and in many other countries. The
detection systems for the C100-3 antibody have proven their partial association with NANB hepatitis;
however, they capture only about 70% of carriers and chronic hepatitis patients, and they fail to detect the
antibody in acute phase infection, thus leaving problems yet to be solved even after development of the
C100-3 antibody detection systems by Chiron Corp.

The course of NANB hepatitis is troublesome, and most patients are considered to become carriers and
then to develop chronic hepatitis. In addition, most patients with chronic hepatitis develop liver cirrhosis,
then hepatocellular carcinoma. It is therefore imperative to isolate the virus itself and to develop
effective diagnostic reagents enabling earlier diagnosis.

The presence of a number of NANB hepatitis cases which cannot be diagnosed by Chiron's C100-3 antibody
detection kits suggests a possible difference in subtype between Chiron's HCV and the Japanese NANB
hepatitis virus.

In order to develop NANB hepatitis diagnostic kits of higher specificity and to develop effective vaccines,
it becomes essential to analyze each subtype of the NANB hepatitis causative virus at the genetic and
corresponding amino acid level.

Summary of the Invention

An object of the present invention is to provide the nucleotide sequence coding for the structural protein
of NANB hepatitis virus and, with such information, to analyze the amino acids of the protein to locate and
provide polypeptides useful as antigens for establishment of detection systems for NANB virus, its related
antigens and antibodies.

A further object of the present invention is to locate polynucleotides essential to treatment, prevention
and diagnosis, and polypeptides effective as antigens, by isolating NANB hepatitis virus RNA from human
and chimpanzee virus carriers, cloning the cDNA covering the whole structural gene of the virus to
determine its nucleotide sequence, and studying the amino acid sequence encoded by the cDNA. As a result,
the inventors have determined the nucleotide sequences of the whole genomes of a strain of NANB virus
called HC-J6 and a strain called HC-J8. The NANB hepatitis virus genomes of HC-J6 and HC-J8 differ from
that of Chiron's HCV.

Brief Description of the Drawings 



Figure 1 shows the restriction map and structure of the coding region of NANB hepatitis virus genome
(HC-J6) and positions of clones. C, E, NS-1, NS-2, NS-3, NS-4 and NS-5 are the abbreviations of core,
envelope, non-structural-1, -2, -3, -4 and -5.

Figures 2 to 4 show the method of determination of the nucleotide sequence of the 5' terminus of NANB
hepatitis virus genome of strains HC-J1, HC-J4 and HC-J6, respectively.

Figure 5 shows the method of determination of the nucleotide sequence of the 3' terminus of HC-J6
genome. Solid lines show nucleotide sequences determined by clones from libraries of bacteriophage
lambda gt10, and broken lines show nucleotide sequences determined by clones obtained by PCR.

Figure 6 shows the structure of coding region of NANB hepatitis virus genome (HC-J8) and positions of 
clones. Regions a to n indicate positions of amplification by PCR. 



Detailed Description of the Invention 



The present invention provides NANB hepatitis virus genome RNA for strain HC-J6 (sequence list 1)
consisting of a noncoding region of 340 nucleotides on the 5' terminus, followed by an open reading frame
of 9099 nucleotides coding for the structural protein and non-structural protein, followed by a noncoding
region of 150 nucleotides containing a U-stretch consisting of 108 uracils on the 3' terminus of NANB
hepatitis virus, and NANB hepatitis virus genome having substantially the nucleotide sequence of sequence
list 1.

The present invention provides polynucleotide N-9589 (strain HC-J6) comprising the DNA nucleotide
sequence of sequence list 2; cDNA clone J6-081 comprising the nucleotide sequence of sequence list 3;
cDNA clone J6-08 comprising the nucleotide sequence of sequence list 4; and NANB hepatitis virus
polynucleotides having substantially the sequence of nucleotides of NANB hepatitis virus nucleotides shown
in sequence lists 2 through 4.

The invention provides polypeptide coded for by the genome or polynucleotide of HC-J6 above, polypeptide
P-J6-3033 comprising the polypeptide sequence of sequence list 5, polypeptides produced by using
recombinant genome, recombinant polynucleotides and recombinant cDNA of the whole or a part of the cDNA
above, and polyclonal or monoclonal antibodies against the polypeptides described above.

The present invention also provides NANB hepatitis virus genome for strain HC-J8 comprising
sequence list 6, NANB hepatitis virus RNA consisting of a noncoding region of 341 nucleotides on the
5' terminus, followed by an open reading frame of 9099 nucleotides coding for the structural
protein and non-structural protein, followed by a noncoding region of 71 nucleotides containing a
U-stretch consisting of 30 uracils on the 3' terminus of NANB hepatitis virus comprising sequence list 6,
and NANB hepatitis virus genome having substantially the nucleotide sequence of sequence list 6.

The present invention provides polynucleotide N-9511 for strain HC-J8 comprising the DNA nucleotide
sequence of sequence list 7 and NANB hepatitis virus polynucleotide having substantially the sequence of
nucleotides of NANB hepatitis virus nucleotides comprising sequence list 7.

The invention provides polypeptide coded for by the genome or polynucleotide of HC-J8 above, polypeptide
P-J8-3033 comprising the polypeptide sequence of sequence list 8 and polypeptide P-J8-3033-2
comprising the polypeptide sequence of sequence list 9, polypeptides produced by using recombinant
genome, recombinant polynucleotides and recombinant cDNA of the whole or a part of the cDNA above, and
polyclonal or monoclonal antibodies against the polypeptides described above.

The present invention, furthermore, provides NANB hepatitis diagnostic systems using the polypeptides or
antibodies described above.

In the method described below, NANB hepatitis virus RNA of the present invention was obtained and its 
nucleotide sequence was determined. 

Plasma samples (HC-J1, HC-J4, HC-J6 and HC-J8) were obtained from humans and a chimpanzee. HC-J1,
HC-J6 and HC-J8 were obtained from Japanese blood donors who had tested positive for HCV
antibody. HC-J4 was obtained from the chimpanzee subjected to the challenge test but was negative for
Chiron's C100-3 antibody previously mentioned.

RNA was isolated from each of the plasma samples. Following the study of the 5' terminus of approximately
2,500 nucleotides and the 3' terminus of approximately 1,100 nucleotides disclosed in Japanese patent
application No. 196175/91, the inventors have completed the study of the region coding for non-structural
protein of strain HC-J6 and the study of the full length sequence of 9,589 nucleotides of HC-J6 genome
RNA, and have completed the study of the region coding for non-structural protein of strain HC-J8 and the
study of the full length sequence of 9,511 nucleotides of HC-J8 genome RNA.

As described in the Example below, strain HC-J6 had a 5' noncoding region consisting of 340
nucleotides, and strain HC-J8 had a 5' noncoding region consisting of 341 nucleotides, followed by the region
coding for structural protein and the region coding for non-structural protein.

Concerning the 3' terminus, strain HC-J6 was found to have a region consisting of 150 nucleotides
containing a U-stretch consisting of 103 uracils following the region coding for non-structural protein,
and strain HC-J8 was found to have a region consisting of 71 nucleotides containing a U-stretch consisting
of 30 uracils following the region coding for non-structural protein.

The coding region starting with adenine (341st nucleotide from the 5' terminus for strain HC-J6 and
342nd nucleotide from the 5' terminus for strain HC-J8) was found to have a long Open Reading Frame
consisting of 9099 nucleotides which codes for 3033 amino acids. HCV, or hepatitis C virus, is supposed to
be closely allied to flavivirus in regard to its genetic structure. The coding region of the NANB hepatitis
virus genome of the present invention was considered to consist of regions named C (core), E (envelope),
NS-1 (non-structural-1), NS-2 (non-structural-2), NS-3 (non-structural-3), NS-4 (non-structural-4) and NS-5
(non-structural-5).
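
The segment lengths stated above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original disclosure; the dictionary keys and the function name are illustrative assumptions, while the numbers are those given in the text.

```python
# Segment lengths (in nucleotides) as stated in the text for each strain.
GENOMES = {
    "HC-J6": {"five_prime_noncoding": 340, "orf": 9099, "three_prime_noncoding": 150},
    "HC-J8": {"five_prime_noncoding": 341, "orf": 9099, "three_prime_noncoding": 71},
}


def summarize(segments):
    """Return (total length, 1-based ORF start, number of encoded amino acids)."""
    total = sum(segments.values())
    orf_start = segments["five_prime_noncoding"] + 1
    # Three nucleotides per codon; the text counts 3033 residues for 9099 nt.
    amino_acids = segments["orf"] // 3
    return total, orf_start, amino_acids


for strain, segments in GENOMES.items():
    print(strain, summarize(segments))
```

Under these figures the totals agree with the polynucleotide designations N-9589 (HC-J6) and N-9511 (HC-J8), and the ORF start positions agree with the 341st and 342nd nucleotides named above.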

As compared with the sequence of HCV disclosed in the European Patent Application by Chiron Corp.
(Publication No. 388,232), homology of sequences of the strain HC-J6 was 67.9% for the full nucleotide
sequence and 72.3% for the full amino acid sequence, and homology of sequences of the strain HC-J8 was
66.4% for the full nucleotide sequence and 71.0% for the full amino acid sequence.
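
The text does not state how the homology percentages were computed. As a hedged illustration only, the sketch below assumes the common convention of percent identity over the aligned, non-gap columns of two pre-aligned sequences; the function name and the toy sequences are hypothetical and are not taken from the sequence lists.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned, non-gap columns in which both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy 14-nucleotide sequences differing at one position.
print(round(percent_identity("ATGAGCACAAATCC", "ATGAGCACGAATCC"), 1))
```

Other conventions (for example, counting gaps as mismatches) would give different values, so figures computed this way are only comparable when the same convention is used throughout.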

From an examination of homology by region, the homology of nucleotide sequences (strain HC-J6) of
the 5' terminal noncoding region was 94.4% and that of the amino acid sequences of the C region was
90.1%, showing comparatively high homology; on the other hand, for the regions downstream from the
envelope, homologies of amino acid sequence were found to be as low as 60.4% for E, 71.1% for NS-1, 57.8%
for NS-2, 81.1% for NS-3, 73.1% for NS-4, and 69.9% for NS-5. As a result, HC-J6 strain was found to be
significantly different from the HCV strain found by Chiron Corp.

From an examination of homology by region, the homology of nucleotide sequences (strain HC-J8) of
the 5' terminal noncoding region was 93.8% and that of the amino acid sequences of the C region was
90.1%, showing comparatively high homology; on the other hand, for the regions downstream from the
envelope, homologies of amino acid sequence were found to be as low as 54.7% for E, 73.1% for NS-1, 55.6%
for NS-2, 81.3% for NS-3, 72.1% for NS-4, 67.3% for NS-5, and 25.9% for the 3' terminal noncoding region.
As a result, HC-J8 strain was found to be significantly different from the HCV strain found by Chiron Corp.

From the comparison of the amino acid sequence of HC-J6 strain with strain HC-J1 (American type) and
strain HC-J4 (Japanese type) disclosed by the inventors (Japan. J. Exp. Med. (1990), 60: 167-177),
homology in the core region was more than 90% for each strain while that in the envelope region was
60.9% for HC-J1 and 53.1% for HC-J4. Thus, in the present invention, strain HC-J6 was found to be a
different type of virus than strains HC-J1 or HC-J4.

From the comparison of the sequence of HC-J8 strain with strain HC-J1 (type I) and strain HC-J4
(type II), homology of approximately 3,000 nucleotides of the 5' terminus was 70.1% for HC-J1 and 67.1%
for HC-J4, and from the comparison of all nucleotides with the HC-J6 (type III) genome, homology was as
low as 76.9%. On the other hand, HC-J8 showed high homology with strain HC-J7 (type IV) disclosed in
Japanese patent application 196175/91, namely 93.1% for approximately 3,000 nucleotides of the 5' terminus.

Nucleotide sequences among strains assumed to belong to the same type are supposed to show high homology.
For example, homology of 95.6% for approximately 3,000 nucleotides of the 5' terminus between HCV disclosed
by Chiron Corp. and HC-J1 appears to show that they should be classified into type I. On the other hand, low
homology of HC-J8 with HCV, HC-J1, HC-J4 and HC-J6 appeared to show that it was not to be classified
into type I, II or III, but into type IV (the same as HC-J7).

Strain HC-J8 has some mutations in the nucleotides as shown in sequence lists 6 and 7 by symbols M,
R, W, S, Y, K and B. It also can be easily understood that it has some mutations of amino acids from
comparison of the sequences in sequence lists 8 and 9. Mutation of nucleotides was observed up to
approximately 1.4% in the whole genome and that of amino acids was observed up to approximately 1.7%
in the whole ORF. Thus the present invention includes genomes, polynucleotides and polypeptides of strain
HC-J8 having some mutations.
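
The symbols M, R, W, S, Y, K and B are read here as the standard IUPAC nucleotide ambiguity codes, which is an assumption about the notation of the sequence lists. Under that assumption, a short Python sketch (function names are illustrative) shows how a sequence written with such symbols expands into the concrete variants it stands for; T is used for DNA, and U would take its place in an RNA listing.

```python
# Standard IUPAC ambiguity codes for the symbols named in the text.
IUPAC = {
    "M": "AC", "R": "AG", "W": "AT", "S": "CG",
    "Y": "CT", "K": "GT", "B": "CGT",
}


def expand(symbol):
    """Bases a symbol can stand for; plain bases stand for themselves."""
    return sorted(IUPAC.get(symbol, symbol))


def variants(seq):
    """Enumerate every concrete sequence consistent with the ambiguity codes."""
    results = [""]
    for symbol in seq:
        results = [prefix + base for prefix in results for base in expand(symbol)]
    return results


# Toy example, not taken from the sequence lists.
print(variants("ARY"))
```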

In addition, the envelope (E) region (576 nucleotides/192 amino acids; amino acids 192-383) and the NS-1
region (1050 nucleotides/350 amino acids; amino acids 384-733), which have many mutations in HC-J8, are
called hyper-variable regions, since mutations were observed at 20 nucleotides/7 amino acids (3.47%/3.64%)
in the E region and 37 nucleotides/19 amino acids (3.52%/5.42%) in the NS-1 region. According to these
findings, the present invention can be recognized to include genomes, and polypeptides coded for by the
genomes, of strain HC-J8 having mutations of 3.5% to 5.5% in those regions.
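
The percentages quoted for the hyper-variable regions can be reproduced from the region lengths. In the sketch below, the nucleotide counts (20 and 37) and region lengths are those of the text; the amino acid counts of 7 and 19 are an assumption inferred from the stated percentages, as is the convention of truncating (rather than rounding) to two decimals.

```python
def percent_truncated(count, length, digits=2):
    """Percentage count/length, truncated (not rounded) to `digits` decimals."""
    scale = 10 ** digits
    return (count * 100 * scale // length) / scale


# (mutated nucleotides, nucleotides, mutated amino acids, amino acids)
REGIONS = {
    "E": (20, 576, 7, 192),
    "NS-1": (37, 1050, 19, 350),
}

for name, (nt_mut, nt_len, aa_mut, aa_len) in REGIONS.items():
    print(name, percent_truncated(nt_mut, nt_len), percent_truncated(aa_mut, aa_len))
```

The region lengths are themselves consistent: amino acids 192-383 span 192 residues (576 nucleotides) and amino acids 384-733 span 350 residues (1050 nucleotides).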

The genome, polynucleotide, and cDNA clones of the present invention can be used as material to
produce peptides of the invention by integration into a host genome, e.g. E. coli or Bacillus, by means of
known genetic engineering techniques.

Polypeptides of the invention are useful as material for diagnostic agents to detect NANB hepatitis 
antibodies with high specificity and as material to produce polyclonal and monoclonal antibodies by known 
techniques. 
Polyclonal and monoclonal antibodies of the invention are useful as materials for diagnostic agents to
detect NANB hepatitis antigens with high specificity. 

A detection system using each polypeptide of the present invention or polypeptide with partial 
replacement of amino acids, and a detection system using monoclonal or polyclonal antibodies to such 
polypeptides, are useful as diagnostic agents of NANB hepatitis with high specificity and are effective to 
screen out NANB hepatitis virus from transfusion bloods or blood derivatives. The polypeptides, or 
antibodies to such polypeptides, can be used as a material for a vaccine against NANB hepatitis virus. 

It is well known in the art that one or more nucleotides in a DNA sequence can be replaced by other
nucleotides in order to produce the same protein. The present invention also concerns such nucleotide
substitutions which yield DNA sequences which code for polypeptides as described above. It is also well
known in the art that one or more amino acids in an amino acid sequence can be replaced by equivalent
other amino acids, as demonstrated by U.S. Patent No. 4,737,487 which is incorporated by reference, in
order to produce an analog of the amino acid sequence. Any analogs of the polypeptides of the present
invention involving amino acid deletions, amino acid replacements, such as replacements by other amino
acids, or by isosteres (modified amino acids that bear close structural and spatial similarity to protein amino
acids), amino acid additions, or isostere additions can be utilized, so long as the sequences elicit
antibodies recognizing NANB antigens.
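
The first point, that different DNA sequences can code for the same protein, follows from the degeneracy of the genetic code. The Python sketch below illustrates it with a deliberately small subset of the standard codon table; the two toy sequences and the function name are hypothetical and are not taken from the sequence lists.

```python
# Small subset of the standard genetic code, enough for the toy example.
CODONS = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "AAA": "Lys", "AAG": "Lys",
    "ATG": "Met",
    "TGG": "Trp",
}


def translate(dna):
    """Translate complete codons of `dna` using the subset table above."""
    usable = len(dna) - len(dna) % 3
    return [CODONS[dna[i:i + 3]] for i in range(0, usable, 3)]


original = "ATGGCTAAA"
substituted = "ATGGCCAAG"  # two third-position substitutions

print(translate(original))
print(translate(original) == translate(substituted))
```

Both sequences translate to Met-Ala-Lys even though they differ at two nucleotide positions.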

Examples of application of this invention are shown below; however, the invention shall in no way be
limited to those examples.

Examples 



The 5' terminal nucleotide sequence and amino acid sequence of NANB hepatitis virus genome were 
determined in the following way: 


(1) Isolation of RNA 

RNA of the samples (HC-J1, HC-J6, HC-J8) from plasma of Japanese blood donors testing positive for
HCV (C100-3) antibody (by Ortho HCV Ab ELISA, Ortho Diagnostic System, Tokyo), and that of the sample
(HC-J4) from the chimpanzee challenged with NANB hepatitis for infectivity and negative for HCV antibody,
were isolated by the following method:

Each plasma sample was mixed with Tris chloride buffer (10 mM, pH 8.0) and centrifuged at 68 × 10³
rpm for 1 hour. The precipitate was suspended in Tris chloride buffer (50 mM, pH 8.0) containing 200 mM
NaCl, 10 mM EDTA, 2% (w/v) sodium dodecyl sulfate (SDS) and 1 mg/ml proteinase K, and incubated at 60 °C
for 1 hour; the nucleic acids were then extracted with phenol/chloroform and precipitated with ethanol to
obtain RNA.

(2) HC-J1 and HC-J8 cDNA Synthesis

After heating the RNA isolated from HC-J1 or HC-J8 plasma at 70 °C for 1 minute, this was used as a
template; 10 units of reverse transcriptase (cDNA Synthesis System Plus, Amersham Japan) and 20 pmol
of oligonucleotide primer (20-mer) were added and incubated at 42 °C for 1.5 hours to obtain cDNA. Primer
#8 (5'-GATGCTTGCGGAAGCAATCA-3') was prepared by referring to the basic sequence shown in
European Patent Application No. 88310922.5, which is relied on and incorporated herein by reference.


(3) Amplification of cDNA by Polymerase Chain Reaction (PCR)

cDNA was amplified for 35 cycles according to Saiki's method (Science (1988) 239: 487-491) using
Gene Amp DNA Amplifier Reagent (Perkin-Elmer Cetus) on a DNA Thermal Cycler (Perkin-Elmer Cetus).

For cDNA synthesis and for PCR for HC-J8, synthesized primers disclosed in Japanese patent
application 153402/90 and those based on HC-J1, HC-J4 and HC-J6 genomes disclosed in Japanese patent
application 196175/91 and below were utilized.

(4) Determination of 5' Terminal Nucleotide Sequence of HC-J1 and HC-J4 by Assembling cDNA Clones

As shown in Figures 2 and 3, nucleotide sequences of the 5' termini of the genomes of strains HC-J1 and
HC-J4 were determined by combined analysis of clones obtained from the cDNA library constructed in
bacteriophage λgt10 and clones obtained by amplification of HCV specific cDNA by PCR.

Figures 2 and 3 show the 5' termini of NANB hepatitis virus genome together with cleavage sites of
restriction endonucleases and sequences of primers used. In the figures, solid lines are nucleotide sequences
determined by clones from the bacteriophage λgt10 library while dotted lines show sequences determined by
clones obtained by PCR.

A 1656 nucleotide sequence of HC-J1 spanning nt454-2109 was determined by clone 041, which was
obtained by inserting the cDNA synthesized with the primer #8 into λgt10 phage vector (Amersham).

Another primer #25 (5'-TCCCTGTTGCATAGTTCACG-3') corresponding to nt824-843 was synthesized
based on the 041 sequence, and four clones (060, 061, 060 and 075) were obtained to cover the upstream
sequence nt18-843.
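
Two small checks are useful when reading the coordinates above. The sketch below, in Python, computes the length of an inclusive nucleotide span and the reverse complement of a primer. It assumes that primer #25 is an antisense primer (which its use for obtaining upstream sequence suggests, although the text does not say so explicitly), so that its reverse complement would correspond to the genome-sense sequence at nt824-843; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def span_length(start, end):
    """Number of nucleotides in the inclusive span nt<start>-<end>."""
    return end - start + 1


primer_25 = "TCCCTGTTGCATAGTTCACG"

print(span_length(454, 2109))         # clone 041
print(span_length(824, 843))          # primer #25 position
print(reverse_complement(primer_25))  # sense-strand reading, under the assumption above
```

The span nt454-2109 gives the 1656 nucleotides stated for clone 041, and nt824-843 gives the 20 nucleotides of the primer.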


(5) Determination of 5' Terminal Nucleotide Sequence of HC-J6. 

The nucleotide sequence of the 5' terminus of strain HC-J6 was determined from analysis of clones
obtained by PCR amplification as shown in Figure 4.

Isolation of RNA from HC-J6 and determination of its sequence were made in the same manner as
described in (2) above. Sequences in the range of nt24-2551 of the RNA were determined from the consensus
sequence of respective clones obtained by amplification by PCR using each pair of primers based on
the nucleotide sequence of HC-J4.



nt24-826
#32 (5'-ACTCCACCATAGATCACTCC-3')
#122 (5'-AGGTTCCCTGTTGCATAATT-3')
Clones: C9397, C9388, C9764

nt732-1907
#50 (5'-GCCGACCTCATGGGGTACAT-3')
#128 (5'-TCGGTCGTGCCCACTACCAC-3')
Clones: C9316, C9752, C9753

nt1847-2571
#149 (5'-TCTGTGTGTGGCCCAGTGTA-3')
#146 (5'-AGTAGCATCATCCACAAGCA-3')
Clones: C11621, C11624, C11655



In order to determine the sequence further upstream toward the 5' terminus, cDNA was synthesized with
antisense primer #36 (5'-AACACTACTCGGCTAGCAGT-3') corresponding to nt246-265, dA residues were then added
to the end of the cDNA corresponding to the 5' terminus using terminal deoxynucleotidyl transferase, and
one-sided PCR amplification was made twice as described below.

cDNA was amplified for 35 cycles as the first stage of PCR using oligo dT primer (20-mer) and antisense
primer #48 (5'-GTTGATCCAAGAAAGGACCC-3') of nt188-207, followed by the second stage of PCR by 30-cycle
amplification using the first PCR product as a template, oligo dT primer (20-mer) and antisense
primer #109 (21-mer; 5'-ACCGGATCCGCAGACCACTAT-3') corresponding to nt140 to 160. The obtained
PCR product was subcloned into M13 phage vector.

The nucleotide sequence from nt1 to 23 was determined from the consensus sequence of 13 isolated clones,
C9577, C9579, C9581, C9587, C9590, C9591, C9595, C9606, C9609, C9615, C9616 and C9619, obtained
above, which were considered to have the complete 5' terminus.

(6) Determination of nucleotide sequence of HC-J6 middle region.

A cDNA library was constructed using λgt10 according to the method described in (2) above from
100 ml of HC-J6 plasma as a starting material. Primers #162 and #81 were prepared for cDNA synthesis by
referring to the basic sequence shown in European Patent Application Publication No. 318,216. Clones
were selected by plaque hybridization.

The nucleotide sequence from nt2552 to 8700 was determined from the consensus sequence of four obtained
cDNA clones, 02 (nt6996 to 8700), 06 (nt6485 to 8700), 08 (nt6008 to 8700) and 081 (nt2199 to 6168), as
shown in Figure 1. Clones 081 and 08 were found to have the nucleotide sequences shown in sequence lists 3
and 4, respectively.

(7) Determination of 3' terminal nucleotide sequence of HC-J6 strain.

As shown in Figure 5, the nucleotide sequence of the 3' terminus of HC-J6 genome was determined by
analysis of clones obtained by amplification of HCV specific cDNA by PCR.

The nucleotide sequence of HC-J6 from nt8701 to 9241 was determined from the consensus sequence of three
clones consisting of 938 nucleotides, C9760, C9234 and C9761, obtained by amplification of the sample using
primers #80 (5'-GACACCCGCTGTTTTGACTC-3') and #60 (5'-GTTCTTACTGCCCAGTTGAA-3').

The nucleotide sequence of the 3' terminus downstream from nt9242 was determined by the method described
below.

Isolation of RNA from HC-J6 was made in the same manner as described in (1) above. Poly (A) was added
to the 3' terminus of the obtained RNA using poly (A) polymerase, cDNA was synthesized using
oligo (dT)20 as a primer, and the obtained cDNA was provided to PCR as a template.

The first PCR product was made using #97 (5'-AGTCAGGGCGTCCCTCATCT-3') as a sense primer.
The second PCR product was made using #90 (5'-[illegible]GCGGCCGATATCT-3'), corresponding to a sequence
downstream of #97, as a sense primer and oligo (dT)23 as an antisense primer, as in the first PCR. The PCR
product obtained by the two-step amplification was blunt-ended by treatment with T4 DNA polymerase,
followed by phosphorylation of the 5' terminus by T4 polynucleotide kinase. The obtained product was
subcloned into the Hinc II site of M13mp19 phage vector.

The sequence was determined from the consensus sequence of 19 obtained clones,
C10311, C10313, C10314, C10320, C10322, C10323, C10326, C10328, C10330, C10333, C10334, C10336,
C10337, C10345, C10346, C10347, C10349, C10350 and C10357.

As a result, the nucleotide sequence of cDNA to HC-J6 genome RNA was determined as shown in 
sequence list 2, and full sequence of genome RNA was determined as shown in sequence list 1. 

(8) Determination of amino acid sequences.

According to the nucleotide sequence of the genome of strain HC-J6, the sequence of the coding region
starting with ATG was determined. As a result, HC-J6 genome was found to have a long Open
Reading Frame coding for a polypeptide precursor consisting of 3033 amino acid residues.
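
How an open reading frame starting with ATG can be located in a cDNA sequence may be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure actually used by the inventors: it scans one strand only, uses the three standard stop codons, ignores reading frames that run off the end of the sequence without a stop codon, and is exercised on a toy sequence rather than on sequence list 2.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, n_codons) of the longest ATG-initiated ORF, or None.

    start and end are 1-based, inclusive, and exclude the stop codon;
    n_codons counts the codons from ATG up to (not including) the stop.
    """
    best = None
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if best is None or n_codons > best[2]:
                    best = (start + 1, pos, n_codons)
                break
    return best


# Toy sequence: ATG AAA TGG CCC followed by the stop codon TAG.
print(longest_orf("CCATGAAATGGCCCTAGTT"))
```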

(9) Determination of 5' terminal nucleotide sequence of HC-J8 

As shown in Figure 6, the nucleotide sequence of the 5' terminus of HC-J8 genome (a region) was
determined by analysis of clones obtained by amplification of HCV specific cDNA by PCR.

Single-stranded cDNA was synthesized using antisense primer #36 (5'-AACACTACTCGGCTAGCAGT-3')
of nt246 to 265 in the same manner as (2) above, then tailed with dATP at its 3' terminus by
terminal deoxynucleotidyl transferase, then amplified by one-sided PCR in two stages.

That is, in the first stage, antisense primer #48 (5'-GTTGATCCAAGAAAGGACCC-3') of nt188 to 207
was used with a sense primer selected from non-specific primers #165 (5'-AAGGATCCGTCGACATCGATAAT(A)17-3')
and #171 (5'-AAGGATCCGTCGACATCGATAATACG(T)17-3') to amplify the dA-tailed cDNA
by PCR for 35 cycles; and in the second stage, using the product of the first-stage PCR as a template,
non-specific primer #166 (5'-AAGGATCCGTCGACATCGAT-3') and antisense primer #109 (21-mer;
5'-ACCGGATCCGCAGACCACTAT-3') were added to initiate PCR for 30 cycles. The product of PCR was subcloned
into M13 phage vector.

Thirteen independent clones (poly dT-tailed: C14951, C14952, C14953, C14958, C14960, C14968,
C14971, C14972 and C14974; poly dA-tailed: C14987, C14996, C14999 and C15000) were obtained (each
considered to have the complete length of the 5' terminus), and the consensus sequence of nt1-139 of the
respective clones was determined.

(10) cDNA amplification of ORF region and 3' terminus by PCR



As shown in Figure 6, the nucleotide sequence downstream from nt140 of HC-J8 genome was
determined by analysis of clones obtained by amplification of HCV specific cDNA by PCR.

Single-stranded cDNAs to HC-J8 RNA were synthesized in the same manner as (2) above using the
antisense primers described below, then they were amplified by PCR using the sense and antisense primers
described below. Each product of PCR was subcloned into M13 phage vector, then the consensus sequence of
the respective clones of each region was determined.
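
The text does not specify how the consensus sequence of the clones was derived. One simple convention, shown below as a hedged Python sketch, is a per-position majority vote over clones of equal length; the function name and the toy clone sequences are hypothetical.

```python
from collections import Counter


def consensus(clones):
    """Per-position majority base over equal-length clone sequences.

    Ties are resolved in favor of the base seen first at that position,
    which is an arbitrary choice made for this sketch.
    """
    if len({len(clone) for clone in clones}) != 1:
        raise ValueError("clones must be non-empty and of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*clones)
    )


# Three toy clones, each with at most one position differing from the others.
print(consensus(["ACGT", "ACGA", "TCGT"]))
```

With three clones per region, as listed below, a single clone-specific substitution (for example, a PCR error) is outvoted by the other two clones under this convention.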

The primers for cDNA synthesis and PCR amplification, and the numbers of the obtained clones, are shown
below for each region. The alphabetical symbol of each amplified region corresponds to that in Figure 6.

b region
nt45-847
Primer for cDNA synthesis: #122 (5'-AGGTTCCCTGTTGCATAATT-3')
Primer for PCR: sense: #32A (5'-CTGTGAGGAACTACTGTCTT-3')
                antisense: #122
Clones: C15221, C15222, C15223

c region
nt732-1354
Primer for cDNA synthesis: #54 (5'-ATCGCGTACGCCAGGATCAT-3')
Primer for PCR: sense: #50 (5'-GCCGATCTCATGGGGTACAT-3')
                antisense: #54
Clones: C15256, C15257, C15258

d region
nt1300-1879
Primer for cDNA synthesis: #199 (5'-GGGGTGAAACAATACACCGG-3')
Primer for PCR: sense: #205 (5'-GGGACATGATGATCAACTGG-3')
                antisense: #199
Clones: C14221, C14222, C14223

e region
nt1833-2518
Primer for cDNA synthesis: #146 (5'-AGTAGCATCATCCACAAGCA-3')
Primer for PCR: sense: #150 (5'-ATCGTCTCGGCTAAGACGGT-3')
                antisense: #146
Clones: C11535, C11540, C11566

f region
nt2433-3451
Primer for cDNA synthesis: #170 (5'-GCATAAGCAGTGATGGGGGC-3')
Primer for PCR: sense: #160 (5'-CAGAACATCGTGGACGTGCA-3')
                antisense: #170
Clones: C15348, C15349, C15356

g region
nt3404-4300
Primer for cDNA synthesis: #225 (5'-TCGCATATGATGATGTCATA-3')
Primer for PCR: sense: #238 (5'-CTACACCTCCAAGGGGTGGA-3')
                antisense: #225
Clones: C15701, C15702, C15703

h region
nt4221-5015
Primer for cDNA synthesis: #216 (5'-GTGGTCTAGACATACGGGCA-3')
Primer for PCR: sense: #230 (5'-CCCATCACGTACTCCACATA-3')
                antisense: #216
Clones: C15391, C15392, C15393

i region
nt4695-5062
Primer for cDNA synthesis: #210 (5'-GCATCTATGTGTGTGAGGCC-3')
Primer for PCR: sense: #209 (5'-TTCGACTCCGTGATCGACTG-3')
                antisense: #210
Clones: C14087, C14088, C14089

j region
nt5021-6169
Primer for cDNA synthesis: #162 (5'-TCCGACTCCGTCACGTAGTG-3')
Primer for PCR: sense: #227 (5'-GTTCTGGGAAGCGGTCTTTA-3')
                antisense: #162
Clones: C15421, C15422, C15423

k region
nt6027-6889
Primer for cDNA synthesis: #232 (5'-GATGGGTCTGTTAGCATGGA-3')
Primer for PCR: sense: #242 (5'-TTGGTAGTGGGAGTCATCTG-3')
                antisense: #232
Clones: C15733, C15734, C15735

l region
nt6834-7735
Primer for cDNA synthesis: #239 (5'-ATCGGTAACTTCTCCTCTTC-3')
Primer for PCR: sense: #241 (5'-CCTTGCGATCCTGAACCTGA-3')
                antisense: #239
Clones: C15798, C15799, C15800

m region
nt7656-8630
Primer for cDNA synthesis: #222 (5'-GACCAGGTCGTCTCCACACA-3')
Primer for PCR: sense: #229 (5'-GTCGTGTGCTGCTCCATGTC-3')
                antisense: #222
Clones: C15376, C15378, C15381

n region
nt8325-9511
Primer for cDNA synthesis: #165
Primer for PCR: sense: #80 (5'-GACACCCGCTGTTTTGACTC-3')
                non-specific: #165
Clones: C15270, C15271, C15272

From the analysis described above, full nucleotide sequence of cDNA to HC-J8 was determined as 
shown in sequence list 7, then full nucleotide sequence of HC-J8 genome RNA as shown in sequence list 6. 
Two amino acid sequences shown in sequence lists 8 and 9 represent those coded for by HC-J8 genome. 
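
Whether the amplified regions together cover the whole HC-J8 genome can be checked from their coordinates. The Python sketch below uses the spans given above for regions a to n (nt1-139 for the a region, as determined in (9)); the region letters follow Figure 6 as read here, and the function names are illustrative assumptions.

```python
# (region, first nucleotide, last nucleotide), 1-based and inclusive.
REGIONS = [
    ("a", 1, 139), ("b", 45, 847), ("c", 732, 1354), ("d", 1300, 1879),
    ("e", 1833, 2518), ("f", 2433, 3451), ("g", 3404, 4300), ("h", 4221, 5015),
    ("i", 4695, 5062), ("j", 5021, 6169), ("k", 6027, 6889), ("l", 6834, 7735),
    ("m", 7656, 8630), ("n", 8325, 9511),
]


def find_gaps(regions, genome_length):
    """Return the uncovered spans (start, end) within nt1..genome_length."""
    gaps = []
    covered_to = 0
    for _, start, end in sorted(regions, key=lambda r: r[1]):
        if start > covered_to + 1:
            gaps.append((covered_to + 1, start - 1))
        covered_to = max(covered_to, end)
    if covered_to < genome_length:
        gaps.append((covered_to + 1, genome_length))
    return gaps


def overlaps(regions):
    """Overlap, in nucleotides, between each region and the next one."""
    ordered = sorted(regions, key=lambda r: r[1])
    return [(a[0], b[0], a[2] - b[1] + 1) for a, b in zip(ordered, ordered[1:])]


print(find_gaps(REGIONS, 9511))
print(overlaps(REGIONS))
```

With these coordinates no gap is reported and every pair of neighboring regions overlaps, which is what allows the region consensus sequences to be joined into one contiguous sequence of 9511 nucleotides.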

Utilizing known immunological techniques, it is possible to determine epitopes (e.g., from the core
region, etc.) from the polypeptides of sequence lists 5, 8 and 9. Determination of such epitopes of the
NANB hepatitis virus opens access to chemical synthesis of the peptide, manufacturing of the peptide by
genetic engineering techniques, synthesis of the polynucleotides, manufacturing of the antibody,
manufacturing of NANB hepatitis diagnostic reagents, and development of products such as NANB hepatitis
vaccines.

According to the well-known method described by Merrifield, NANB peptides can be synthesized.
Furthermore, the polynucleotides in sequence lists 2-4 and 7 can be used to express polypeptides in host
cells such as Escherichia coli by means of genetic engineering techniques.

A detection system for antibody against NANB hepatitis virus can be developed using polyvinyl
microtiter plates and the sandwich method. For example, 50 µl of a NANB peptide solution at a concentration
of 5 µg/ml can be dispensed in each well of the microtiter plates and incubated overnight at room temperature
for consolidation. The microplate wells can be washed five times with physiological saline containing 0.05%
Tween 20. For overcoating, 100 µl of NaCl buffer containing 30% (v/v) of calf serum and 0.05% Tween 20
(CS buffer) can be dispensed in each well and discarded after incubation for 30 minutes at room
temperature.
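
The coating step implies a fixed amount of peptide per well, which a short calculation makes explicit. The sketch below simply multiplies the stated volume by the stated concentration. The function name is an assumption.

```python
# Minimal sketch: peptide mass dispensed per well during coating.

def mass_per_well_ug(volume_ul: float, conc_ug_per_ml: float) -> float:
    """Mass in micrograms = volume (µl) x concentration (µg/ml) / 1000 µl per ml."""
    return volume_ul * conc_ug_per_ml / 1000.0


coating_ug = mass_per_well_ug(50, 5)  # 50 µl of a 5 µg/ml peptide solution
print(f"{coating_ug:.2f} µg per well")
```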

For determination of NANB antibodies in samples, in the primary reaction, 50 µl of the CS buffer
containing 30% calf serum and 10 µl of a sample can be dispensed in each microplate well and incubated
on a microplate vibrator for one hour at room temperature. After completion of the reaction, microplate wells
can be washed five times in the same way as previously described.

In the secondary reaction, as labeled antibody, 1 ng of horseradish peroxidase labeled anti-human IgG
mouse monoclonal antibodies (Fab' fragment: 22G, Institute of Immunology Co., Ltd., Tokyo, Japan)
dissolved in 50 µl of calf serum can be dispensed in each microplate well, and incubated on a microplate
vibrator for one hour at room temperature. Wells can be washed five times in the same way. After addition
of hydrogen peroxide (as substrate) and 50 µl of o-phenylenediamine solution (as color developer) in each
well, and after incubation for 30 minutes at room temperature, 50 µl of 4M sulphuric acid can be dispensed
in each well to stop further color development, and absorbance can be read at 492 nm.

The cut-off level of this assay system can be set by measuring a number of donor samples with a normal
serum ALT (alanine aminotransferase) value of 34 Karmen units or below and which tested negative for
anti-HCV.
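
The text does not state how the cut-off is computed from the donor measurements. One widely used convention, assumed here purely for illustration, is the mean absorbance of the negative donor panel plus a multiple of its standard deviation. The absorbance values, the multiplier of 3, and the function names below are hypothetical.

```python
# Minimal sketch: an assumed mean + k*SD cut-off from negative donor absorbances (492 nm).
from statistics import mean, stdev


def cutoff(negative_absorbances, k=3.0):
    """Return mean + k * sample standard deviation of the negative panel."""
    return mean(negative_absorbances) + k * stdev(negative_absorbances)


def is_reactive(absorbance, cut):
    """A sample is called reactive when its absorbance reaches the cut-off."""
    return absorbance >= cut


# Hypothetical absorbances from donors with normal ALT who tested negative for anti-HCV.
donors = [0.04, 0.05, 0.06, 0.05, 0.07, 0.03]
cut = cutoff(donors)
print(round(cut, 3), is_reactive(0.85, cut), is_reactive(0.06, cut))
```

A larger donor panel would give a more stable estimate, and the multiplier would in practice be chosen to balance sensitivity against specificity.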

The present invention makes possible detection of NANB hepatitis virus infection which could not be
detected by conventional determination methods, and provides NANB hepatitis detection kits capable of
highly specific and sensitive detection at an early phase of infection.
These features allow accurate diagnosis of patients at an early stage of the disease and also help to
remove NANB hepatitis virus carrier bloods at a higher rate through screening tests of donor bloods.

Polypeptides and their antibodies under this invention can be utilized for manufacture of vaccines and
immunological pharmaceuticals, and the structural gene of NANB hepatitis virus provides indispensable tools
for detection of polypeptide antigens and antibodies.
Antigen-antibody complexes can be detected by methods known in this art. Specific monoclonal and
polyclonal antibodies can be obtained by immunizing such animals as mice, guinea pigs, rabbits, goats and
horses with NANB peptides (e.g., bearing NANB hepatitis antigenic epitope).

The present invention is based on studies on isolated virus genomes of NANB hepatitis virus named HC-J6
and HC-J8, and is completed by clarification of the full sequence of the nucleotides. The invention
makes possible highly specific detection of NANB hepatitis virus and provision of polypeptide, polyclonal
antibody and monoclonal antibody to prepare the test system.

Further variations and modifications of the invention will become apparent to those skilled in the art 
from the foregoing and are intended to be encompassed by the claims appended hereto. 

Japanese Priority Applications 287402/91 filed August 9, 1991 and 360441/91 filed on December 5,
1991 are relied on and incorporated by reference. U.S. patent applications serial no. 07/540,604 (filed June
19, 1990), 07/653,090 (filed February 8, 1991), and 07/712,875 (filed June 11, 1991) are incorporated by
reference in their entirety.



Sequence list

Sequence list 1: whole nucleotides of HC-J6 genome RNA
Sequence list 2: N-9589, whole nucleotides of cDNA to HC-J6 genome RNA
Sequence list 3: J6-081, nucleotides of clone J6-081
Sequence list 4: J6-08, nucleotides of clone J6-08
Sequence list 5: P-J6-3033, whole amino acids of ORF of HC-J6 genome
Sequence list 6: whole nucleotides of HC-J8 genome RNA
Sequence list 7: whole nucleotides of cDNA to HC-J8 genome RNA
Sequence list 8: whole amino acids of a variation of ORF of HC-J8 genome
Sequence list 9: whole amino acids of a variation of ORF of HC-J8 genome

Claims

1. Recombinant RNA of non-A, non-B hepatitis virus, strain HC-J6, comprising the nucleotide sequence of
sequence list 1.

2. Recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J6, comprising the nucleotide sequence
of sequence list 2.

3. cDNA clone J6-081 comprising the nucleotide sequence of sequence list 3.

4. cDNA clone J6-08 comprising the nucleotide sequence of sequence list 4.

5. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J6,
comprising the amino acid sequence of sequence list 5.

6. Recombinant RNA of non-A, non-B hepatitis virus, strain HC-J8, comprising the nucleotide sequence of
sequence list 6.

7. Recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J8, comprising the nucleotide sequence
of sequence list 7.

8. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J8,
comprising the amino acid sequence of sequence list 8.

9. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J8,
comprising the amino acid sequence of sequence list 9.

10. A non-A, non-B hepatitis diagnostic test kit for analyzing samples for the presence of antibodies 
directed against a non-A, non-B hepatitis antigen, comprising an antigen attached to a solid substrate 
and labeled anti-human immunoglobulin; wherein said antigen is an antigen selected from the antigens 
contained in sequence lists 5, 8 or 9. 

11. A method of detecting antibodies directed against a non-A, non-B hepatitis antigen in a sample, said 
method comprising: 

(a) reacting said sample with an antigen selected from the antigens contained in sequence lists 5, 8 
or 9 to form antigen-antibody complexes, and 
(b) detecting said antigen-antibody complexes.

12. A non-A, non-B hepatitis specific monoclonal or polyclonal antibody reactive with an antigen, wherein
said antigen is an antigen selected from the antigens contained in sequence lists 5, 8 or 9.

13. A method of detecting non-A, non-B hepatitis antigen in a sample, said method comprising:

(a) reacting said sample with the non-A, non-B hepatitis monoclonal or polyclonal antibody according 
to claim 12 to form antigen-antibody complexes; and 

(b) detecting said antigen-antibody complexes. 
Sequence ID No. 1
Sequence Length: 9,589
Sequence Type: nucleic acid
Strandedness: single
Topology: linear

Molecule Type: genomic RNA
Method for Determination of Feature: E



ACCCGCCCCU AAUAGGGGCG ACACUCCGCC AUGAACCACU CCCCUGUGAG GAACUACUGU 60

CUUCACGCAG AAAGCGUCUA GCCAUGGCGU UAGUAUGAGU GUCGUACAGC CUCCAGGCCC 120 

CCCCCUCCCG GGAGAGCCAU AGUGGUCUGC GGAACCGGUG AGUACACCGG AAUUGCCGGG 180

AAGACUGGGU CCUUUCUUGG AUAAACCCAC UCUAUGCCCG GUCAUUUGGG CGUGCCCCCG 240 

CAAGACUGCU AGCCGAGUAG CGUUGGGUUG CGAAAGGCCU UGUGGUACUG CCUGAUAGGG 300 

UGCUUGCGAG UGCCCCGGGA GGUCUCGUAG ACCGUGCACC AUGAGCACAA AUCCUAAACC 360 

UCAAAGAAAA ACCAAAAGAA ACACCAACCG UCGCCCACAA GACGUUAAGU UUCCGGGCGG 420 

CGGCCAGAUC GUUGGCGGAG UAUACUUGUU GCCGCGCAGG GGCCCCAGGU UGGGUGUGCG 480 

CGCGACAAGG AAGACUUCGG AGCGGUCCCA GCCACGUGGA AGGCGCCAGC CCAUCCCUAA 540 

GGAUCGGCGC UCCACUGGCA AAUCCUGGGG AAAACCAGGA UACCCCUGGC CCCUAUACGG 600

GAAUGAGGGA CUCGGCUGGG CAGGAUGGCU CCUGUCCCCC CGAGGUUCCC GUCCCUCUUG 660 

GGGCCCCAAU GACCCCCGGC AUAGGUCCCG CAACGUGGGU AAGGUCAUCG AUACCCUAAC 720 

GUGCGGCUUU GCCGACCUCA UGGGGUACAU CCCUGUCGUA GGCGCCCCGC UCGGCGGCGU 780 

CGCCAGAGCU CUCGCGCAUG GCGUGAGAGU CCUGGAGGAC GGGGUUAAUU UUGCAACAGG 840 

GAACUUACCC GGUUGCUCCU UUUCUAUCUU CUUGCUGGCC CUGCUGUCCU GCAUCACCAC 900 

CCCGGUCUCC GCUGCCGAAG UGAAGAACAU CAGUACCGGC UACAUGGUGA CCAACGACUG 960 

CACCAAUGAU AGCAUUACCU GGCAACUCCA GGCUGCUGUC CUCCACGUCC CCGGGUGCGU 1020 

CCCGUGCGAG AAAGUGGGGA AUACAUCUCG GUGCUGGAUA CCGGUCUCAC CGAAUGUGGC 1080 

CGUGCAGCAG CCCGGCGCCC UCACGCAGGG CUUACGGACG CACAUUGACA UGGUUGUGAU 1140 
GUCCGCCACG CUCUGCUCCG CUCUUUACGU GGGGGACCUC UGCGGUGGGG UGAUGCUUGC 1200
AGCCCAGAUG UUCAUUGUCU CGCCACAGCA CCACUGGUUU GUGCAAGACU GCAAUUGCUC 1260
CAUCUACCCU GGUACCAUCA CUGGACACCG CAUGGCGUGG GACAUGAUGA UGAACUGGUC 1320
GCCCACGGCU ACCAUGAUCC UGGCGUACGC GAUGCGCGUC CCCGAGGUCA UCAUAGACAU 1380
CAUUGGCGGG GCUCAUUGGG GCGUCAUGUU CGGCUUAGCC UACUUCUCUA UGCAGGGAGC 1440
GUGGGCAAAA GUCGUUGUCA UUCUUUUGCU GGCCGCCGGG GUGGACGCGC AAACCCAUAC 1500
CGUUGGGGGU UCUACCGCGC AUAACGCCAG GACCCUCACC GGCAUGUUCU CCCUUGGUGC 1560
CAGGCAGAAA AUCCAGCUCA UCAACACCAA UGGCAGUUGG CACAUCAACC GCACCGCCCU 1620
GAACUGCAAU GACUCUUUGC ACACCGGCUU CCUCGCGUCA CUGUUCUACA CCCACAGCUU 1680
CAACUCGUCA GGAUGUCCCG AACGCAUGUC CGCCUGCCGC AGUAUCGAGG CCUUUCGGGU 1740
GGGAUGGGGC GCCUUACAAU AUGAGGACAA UGUCACCAAU CCAGAGGAUA UGAGACCGUA 1800
UUGCUGGCAC UACCCACCAA GACAGUGUGG UGUAGUCUCC GCGAGCUCUG UGUGUGGCCC 1860
AGUGUACUGU UUCACCCCCA GCCCAGUAGU AGUGGGUACG ACCGAUAGAC UUGGAGCGCC 1920
CACUUACACG UGGGGGGAGA AUGAGACAGA UGUCUUCCUA UUGAACAGCA CUCGACCACC 1980
GCAGGGGUCA UGGUUCGGCU GCACGUGGAU GAACUCCACU GGCUACACCA AGACUUGCGG 2040
CGCACCACCC UGCCGCAUUA GAGCUGACUU CAAUGCCAGC AUGGACUUGU UGUGCCCCAC 2100
GGACUGUUUU AGGAAGCAUC CUGAUACCAC CUACAUCAAA UGUGGCUCUG GGCCCUGGCU 2160
CACGCCAAGG UGCCUGAUCG ACUACCCCUA CAGGCUCUGG CAUUACCCCU GCACAGUUAA 2220
CUAUACCAUC UUCAAAAUAA GGAUGUAUGU GGGGGGGGUC GAGCACAGGC UCACGGCUGC 2280
GUGCAAUUUC ACUCGUGGGG AUCGUUGCAA CUUGGAGGAC AGAGACAGAA GUCAACUGUC 2340
UCCUUUGCUG CACUCCACCA CGGAGUGGGC CAUUUUACCU UGCACUUACU CGGACCUGCC 2400
CGCCUUGUCG ACUGGUCUUC UCCACCUCCA CCAAAACAUC GUGGACGUGC AAUUCAUGUA 2460
UGGCCUAUCA CCUGCUCUCA CAAAAUACAU CGUCCGAUGG GAGUGGGUAG UACUCUUAUU 2520
CCUGCUCUUA GCGGACGCCA GGGUUUGCGC CUGCUUAUGG AUGCUCAUCU UGUUGGGCCA 2580
GGCCGAAGCA GCACUAGAGA AGUUGGUCGU CUUGCACGCU GCGAGCGCAG CUAGCUGCAA 2640
UGGCUUCCUA UACUUUGUCA UCUUUUUCGU GGCUGCUUGG UACAUCAAGG GUCGGGUAGU 2700
CCCCUUGGCU ACUUAUUCCC UCACUGGCCU AUGGUCCUUU GGCCUACUGC UCCUAGCAUU 2760
GCCCCAACAG GCUUAUGCUU AUGACGCAUC UGUACAUGGU CAGAUAGGAG CAGCUCUGUU 2820
GGUACUGAUC ACUCUCUUUA CACUCACCCC CGGGUAUAAG ACCCUUCUCA GCCGGUUUCU 2880
GUGGUGGUUG UGCUAUCUUC UGACCCUGGC GGAAGCUAUG GUCCAGGAGU GGGCACCACC 2940
UAUGCAGGUG CGCGGUGGCC GUGAUGGGAU CAUAUGGGCC GUCGCCAUAU UCUGCCCGGG 3000 
UGUGGUGUUU GACAUAACCA AGUGGCUCUU GGCGGUGCUU GGGCCUGCUU AUCUCCUAAA 3060 
AGGUGCUUUG ACGCGUGUGC CGUACUUCGU CAGGGCUCAC GCUCUACUAA GGAUGUGCAC 3120 
CAUGGUAAGG CAUCUCGCGG GGGGUAGGUA CGUCCAGAUG GUGCUACUAG CCCUUGGCAG 3180 
GUGGACUGGC ACUUACAUCU AUGACCACCU CACCCCUAUG UCGGAUUGGG CUGCUAAUGG 3240 
CCUGCGGGAC UUGGCGGUCG CCGUGGAGCC UAUCAUCUUC AGUCCGAUGG AGAAAAAAGU 3300 
CAUCGUCUGG GGAGCGGAGA CAGCUGCUUG CGGGGAUAUC UUACACGGAC UUCCCGUGUC 3360 
CGCCCGACUU GGCCGGGAGG UCCUCCUUGG CCCAGCUGAU GGCUAUACCU CCAAGGGGUG 3420 
GAGUCUUCUC GCCCCCAUCA CUGCUUAUGC CCAGCAGACA CGCGGCCUUU UGGGCACCAU 3480 
AGUGGUGAGC AUGACGGGGC GCGACAAGAC AGAACAGGCC GGGGAGAUUC AGGUCCUGUC 3540 
CACGGUCACU CAGUCCUUCC UCGGAACAAC CAUCUCGGGG GUCUUAUGGA CUGUCUACCA 3600 
UGGAGCUGGC AACAAGACUC UAGCCGGCUC ACGGGGUCCG GUCACACAGA UGUACUCCAG 3660 
UGCUGAGGGG GACUUAGUGG GGUGGCCCAG CCCCCCCGGG ACCAAAUCUU UGGAGCCGUG 3720 
CACGUGUGGA GCGGUCGACC UAUACCUGGU CACGCGAAAC GCUGAUGUCA UCCCGGCUCG 3780 
AAGACGCGGG GACAAGCGAG GAGCGCUACU CUCCCCGAGA CCUCUUUCCA CCUUGAAGGG 3840 
GUCCUCGGGG GGCCCGGUGC UCUGCCCCAG AGGCCACGCU GUCGGGGUCU UCCGGGCAGC 3900 
CGUGUGCUCC CGGGGCGUGG CCAAGUCCAU AGAUUUUAUC CCCGUUGAGA CACUUGACAU 3960 
CGUCACUCGG UCCCCCACCU UUAGUGACAA CAGCACACCA CCUGCUGUGC CCCAAACUUA 4020 
UCAGGUCGGG UACUUACAUG CCCCGACUGG UAGUGGAAAG AGCACCAAAG UCCCUGUCGC 4080 
GUAUGCCGCU CAGGGGUACA AAGUGCUAGU GCUUAAUCCC UCGGUGGCUG CCACCCUGGG 4140 
GUUUGGGGCG UACUUGUCCA AGGCACAUGG CAUCAAUCCC AACAUUAGGA CUGGGGUCAG 4200
GACUGUGACG ACCGGGGCGC CCAUCACGUA CUCCACAUAU GGCAAAUUCC UCGCCGAUGG 4260 
GGGCUGCGCA GGCGGCGCCU AUGACAUCAU CAUAUGCGAU GAAUGCCAUG CCGUGGACUC 4320 
UACCACCAUU CUCGGCAUCG GAACAGUCCU CGAUCAAGCA GAGACAGCCG GGGUCAGGCU 4380 
AACUGUACUG GCUACGGCUA CGCCCCCCGG GUCAGUGACA ACCCCCCACC CCAACAUAGA 4440 
GGAGGUGGCC CUCGGGCAGG AGGGUGAGAU CCCCUUCUAU GGGAGGGCGA UUCCCCUGUC 4500 
AUACAUCAAG GGAGGAAGAC ACUUGAUCUU CUGCCACUCA AAGAAAAAGU GUGACGAGCU 4560 
CGCGGCGGCC CUUCGGGGUA UGGGCUUGAA CGCAGUGGCA UACUACAGAG GGCUGGACGU 4620 
CUCCGUAAUA CCAACUCAGG GAGACGUAGU GGUCGUCGCC ACCGACGCCC UCAUGACGGG 4680
GUUUACUGGA GACUUUGACU CCGUGAUCGA CUGCAACGUA GCGGUCACUC AAGUUGUAGA 4740
CUUCAGCUUG GACCCCACAU UCACCAUAAC CACACAGACU GUCCCUCAAG ACGCUGUCUC 4800
ACGUAGCCAG CGCCGGGGCC GCACGGGCAG GGGAAGACUG GGUAUUUAUA GGUAUGUUUC 4860
CACUGGUGAG CGAGCCUCAG GAAUGUUUGA CAGUGUAGUG CUCUGCGAGU GCUACGAUGC 4920
AGGGGCCGCA UGGUAUGAGC UCACACCAGC GGAGACCACC GUCAGGCUCA GAGCAUAUUU 4980 
CAACACACCU GGUUUGCCUG UGUGCCAAGA CCAUCUUGAG UUUUGGGAGG CAGUUUUCAC 5040 
CGGCCUCACA CACAUAGAUG CCCACUUCCU UUCCCAAACA AAGCAAUCGG GGGAAAAUUU 5100 
CGCAUACUUA ACAGCCUACC AGGCUACAGU GUGCGCUAGG GCCAAAGCCC CCCCCCCGUC 5160 
CUGGGACGUC AUGUGGAAGU GUUUGACUCG ACUCAAGCCC ACACUCGUGG GCCCCACACC 5220 
UCUCCUGUAC CGCUUGGGCU CUGUUACCAA CGAGGUCACC CUCACGCAUC CUGUGACGAA 5280 
AUACAUCGCC ACCUGCAUGC AAGCCGACCU UGAGGUCAUG ACCAGCACGU GGGUCUUAGC 5340 
UGGGGGGGUC UUGGCGGCCG UCGCCGCGUA CUGCCUGGCG ACCGGGUGUG UUUGCAUCAU 5400 
CGGCCGCUUG CACGUUAACC AGCGAGCCGU CGUUGCACCG GACAAGGAGG UCCUCUAUGA 5460 
GGCUUUUGAU GAGAUGGAGG AAUGUGCCUC UAGAGCGGCU CUCAUUGAAG AGGGGCAGCG 5520 
GAUAGCCGAG AUGCUGAAGU CCAAGAUCCA AGGCUUAUUG CAGCAAGCUU CCAAACAAGC 5580
UCAAGACAUA CAACCCGCUG UGCAGGCUUC UUGGCCCAAG GUAGAGCAAU UCUGGGCCAA 5640 
ACACAUGUGG AACUUCAUCA GCGGCAUUCA AUACCUCGCA GGACUAUCAA CACUGCCAGG 5700
GAACCCUGCU GUAGCUUCCA UGAUGGCAUU CAGUGCCGCC CUCACCAGUC CGUUGUCAAC 5760
UAGCACCACU AUCCUUCUCA ACAUUUUGGG GGGCUGGCUA GCAUCCCAAA UUGCGCCUCC 5820
CGCGGGGGCU ACCGGCUUCG UCGUCAGUGG CCUGGUGGGG GCUGCCGUAG GCAGCAUAGG 5880 
CUUGGGUAAG GUGCUGGUGG ACAUCCUGGC AGGGUAUGGU GCGGGCAUUU CGGGGGCUCU 5940 
CGUCGCAUUC AAGAUCAUGU CUGGCGAGAA GCCCUCCAUG GAGGAUGUUG UCAACCUGCU 6000 
GCCUGGAAUU CUGUCUCCGG GUGCCCUGGU GGUGGGAGUC AUCUGCGCGG CCAUCCUGCG 6060
CCGACACGUG GGACCGGGGG AAGGCGCUGU CCAAUGGAUG AAUAGGCUCA UUGCCUUUGC 6120 
UUCCAGAGGA AACCACGUCG CCCCCACCCA CUACGUGACG GAGUCGGAUG CGUCGCAGCG 6180
UGUGACCCAA CUACUUGGCU CCCUUACCAU AACCAGCCUG CUCAGGAGAC UCCACAACUG 6240 
GAUUACUGAA GACUGCCCCA UCCCAUGCAG CGGCUCGUGG CUCCGCGAUG UGUGGGAUUG 6300 
GGUUUGCACC AUCCUAACAG ACUUUAAAAA CUGGCUGACC UCCAAAUUGU UCCCAAAGAU 6360 
GCCUGGUCUC CCCUUUAUCU CUUGUCAAAA GGGGUACAAG GGCGUGUGGG CUGGCACUGG 6420
UAUCAUGACC ACACGGUGUC CUUGCGGCGC CAAUAUCUCU GGCAAUGUCC GCCUGGGCUC 6480
CAUGAGAAUU ACGGGGCCCA AAACCUGCAU GAAUAUCUGG CAGGGGACCU UUCCCAUCAA 6540 
UUGUUACACG GAGGGCCAGU GCGUGCCGAA ACCCGCACCA AACUUUAAGA UCGCCAUCUG 6600 
GAGGGUGGCG GCCUCAGAGU ACGCGGAGGU GACGCAGCAC GGGUCAUACC ACUACAUAAC 6660 
AGGACUUACC ACUGAUAACU UGAAAGUUCC UUGCCAACUA CCUUCUCCAG AGUUCUUUUC 6720 
CUGGGUGGAC GGAGUGCAGA UCCAUAGGUU UGCCCCCAUA CCGAAGCCGU UUUUUCGGGA 6780 
UGAGGUCUCG UUCUGCGUUG GGCUUAAUUC AUUUGUCGUC GGGUCUCAGC UCCCUUGCGA 6840 
UCCUGAACCU GACACAGACG UAUUGACGUC CAUGCUAACA GACCCAUCCC AUAUCACGGC 6900
GGAGACUGCA GCGCGGCGUU UGGCACGGGG GUCACCCCCG UCCGAGGCAA GCUCCUCAGC 6960 
GAGCCAGCUA UCGGCACCAU CGCUGCGAGC CACCUGCACC ACCCACGGCA AGGCCUAUGA 7020 
UGUGGACAUG GUGGAUGCCA ACCUGUUCAU GGGGGGCGAU GUGACCCGGA UAGAGUCUGA 7080 
GUCCAAAGUG GUCGUUCUGG ACUCUCUCGA CCCAAUGGUC GAAGAAAGGA GCGACCUUGA 7140 
GCCUUCGAUA CCAUCGGAAU AUAUGCUCCC CAAGAAGAGA UUCCCACCAG CCUUACCGGC 7200
UUGGGCACGG CCUGAUUACA ACCCACCGCU UGUGGAAUCG UGGAAGAGGC CAGAUUACCA 7260 
ACCGGCCACU GUUGCGGGCU GCGCUCUCCC CCCCCCUAAG AAAACCCCGA CGCCUCCCCC 7320 
AAGGAGACGC CGGACAGUGG GUCUGAGUGA GAGCUCCAUA GCAGAUGCCC UACAACAGCU 7380 
GGCCAUCAAG UCCUUUGGCC AGCCCCCCCC AAGCGGCGAU UCAGGCCUUU CCACGGGGGC 7440 
GGACGCAGCC GAUUCCGGCA GUCGGACGCC CCCCGAUGAG UUGGCCCUUU CGGAGACAGG 7500 
UUCCAUCUCC UCCAUGCCCC CUCUCGAGGG GGAGCCUGGA GAUCCAGACU UGGAGCCUGA 7560 
GCAGGUAGAG CUUCAACCUC CCCCCCAGGG GGGGGUGGUA ACCCCCGGCU CAGGCUCGGG 7620 
GUCUUGGUCU ACUUGCUCCG AGGAGGACGA CUCCGUCGUG UGCUGCUCCA UGUCAUACUC 7680 
CUGGACCGGG GCUCUAAUAA CUCCUUGUAG CCCCGAAGAG GAAAAGUUGC CAAUUGGCCC 7740 
CUUGAGCAAC UCCCUGUUGC GAUAUCACAA CAAGGUGUAC UGUACCACAU CAAAGAGCGC 7800
CUCAUUAAGG GCUAAAAAGG UAACUUUUGA UAGGAUGCAA GCGCUCGACG CUCAUUAUGA 7860 
CUCAGUCUUG AAGGACAUUA AGCUAGCGGC CUCCAAGGUC ACCGCAAGGC UUCUCACUUU 7920 
AGAGGAGGCC UGCCAGUUAA CUCCACCCCA CUCUGCAAGA UCCAAGUAUG GGUUUGGGGC 7980 
UAAGGAGGUC CGCAGCUUGU CCGGGAGAGC CGUUAACCAC AUCAAGUCCG UGUGGAAGGA 8040 
CCUCCUGGAA GACACACAAA CACCAAUUCC UACAACCAUC AUGGCCAAAA AUGAGGUGUU 8100 
CUGCGUGGAC CCCACCAAGG GGGGUAAGAA AGCAGCUCGC CUUAUCGUUU ACCCUGACCU 8160
CGGCGUCAGG GUCUGCGAGA AAAUGGCCCU UUAUGAUAUC ACACAAAAGC UUCCUCAGGC 8220 
GGUGAUGGGG GCUUCUUAUG GAUUCCAGUA CUCCCCCGCU CAGCGGGUGG AGUUUCUCUU 8280 
GAAGGCAUGG GCGGAAAAGA AAGACCCUAU GGGUUUUUCG UAUGAUACCC GAUGCUUUGA 8340 
CUCAACCGUC ACUGAGAGAG ACAUCAGGAC UGAGGAGUCC AUAUAUCGGG CUUGUUCCUU 8400
GCCCGAGGAG GCCCACACUG CCAUACACUC ACUGACUGAG AGACUUUACG UGGGAGGGCC 8460 
CAUGUUCAAC AGCAAGGGCC AGACCUGCGG GUACAGGCGU UGCCGCGCCA GCGGGGUGCU 8520 
UACCACUAGC AUGGGGAACA CCAUCACAUG CUAUGUGAAA GCCUUAGCGG CCUGUAAGGC 8580 
UGCAGGGAUA AUUGCGCCCA CAAUGCUGGU AUGCGGCGAU GACUUGGUUG UCAUCUCAGA 8640 
GAGCCAGGGG ACCGAGGAGG ACGAGCGGAA CCUGAGAGCC UUCACGGAGG CUAUGACCAG 8700 
GUAUUCUGCC CCUCCUGGUG ACCCCCCCAG ACCGGAAUAU GACCUGGAGC UGAUAACAUC 8760 
UUGCUCCUCA AAUGUGUCUG UGGCGUUGGG CCCACAAGGC GGCCGCAGAU ACUACCUGAC 8820 
CAGAGACCCU ACCACUCCAA UCGCCCGGGC UGCCUGGGAA ACAGUUAGAC ACUCCCCUGU 8880 
CAAUUCAUGG CUAGGAAACA UCAUCCAGUA CGCCCCAACC AUAUGGGCUC GCAUGGUCCU 8940 
GAUGACACAC UUCUUCUCCA UUCUCAUGGC CCAAGAUACU CUGGACCAGA ACCUCAACUU 9000 
UGAGAUGUAC GGAGCGGUGU ACUCCGUGAG UCCCUUGGAC CUCCCAGCCA UAAUUGAAAG 9060 
GUUACACGGG CUUGACGCUU UCUCUCUGCA CACAUACACU CCCCACGAAC UGACACGGGU 9120 
GGCUUCAGCC CUCAGAAAAC UUGGGGCGCC ACCCCUCAGA GCGUGGAAGA GCCGGGCACG 9180 
UGCAGUCAGG GCGUCCCUCA UCUCCCGUGG GGGGAGAGCG GCCGUUUGCG GCCGAUAUCU 9240 
CUUCAACUGG GCGGUGAAGA CCAAGCUCAA ACUCACUCCA UUGCCGGAAG CGCGCCUCCU 9300 
GGAUUUAUCC AGCUGGUUCA CUGUCGGCGC CGGCGGGGGC GACAUUUAUC ACAGCGUGUC 9360 
GCGUGCCCGA CCCCGCUUAU UACUCCUUGG CCUACUCCUA CUUUUUGUAG GGGUAGGCCU 9420 
UUUCCUACUC CCCGCUCGGU AGAGCGGCAC ACAUUAGCUA CACUCCAUAG CUAACUGUCC 9480
CUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU 9540 
UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUU 9589 
Sequence ID No. 2
Sequence Length: 9,589 
Sequence Type: nucleic acid 
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



ACCCGCCCCT AATAGGGGCG ACACTCCGCC ATGAACCACT CCCCTGTGAG GAACTACTGT 60 

CTTCACGCAG AAAGCGTCTA GCCATGGCGT TAGTATGAGT GTCGTACAGC CTCCAGGCCC 120 

CCCCCTCCCG GGAGAGCCAT AGTGGTCTGC GGAACCGGTG AGTACACCGG AATTGCCGGG 180 

AAGACTGGGT CCTTTCTTGG ATAAACCCAC TCTATGCCCG GTCATTTGGG CGTGCCCCCG 240 

CAAGACTGCT AGCCGAGTAG CGTTGGGTTG CGAAAGGCCT TGTGGTACTG CCTGATAGGG 300 

TGCTTGCGAG TGCCCCGGGA GGTCTCGTAG ACCGTGCACC ATGAGCACAA ATCCTAAACC 360 

TCAAAGAAAA ACCAAAAGAA ACACCAACCG TCGCCCACAA GACGTTAAGT TTCCGGGCGG 420 

CGGCCAGATC GTTGGCGGAG TATACTTGTT GCCGCGCAGG GGCCCCAGGT TGGGTGTGCG 480 

CGCGACAAGG AAGACTTCGG AGCGGTCCCA GCCACGTGGA AGGCGCCAGC CCATCCCTAA 540 

GGATCGGCGC TCCACTGGCA AATCCTGGGG AAAACCAGGA TACCCCTGGC CCCTATACGG 600 

GAATGAGGGA CTCGGCTGGG CAGGATGGCT CCTGTCCCCC CGAGGTTCCC GTCCCTCTTG 660 

GGGCCCCAAT GACCCCCGGC ATAGGTCCCG CAACGTGGGT AAGGTCATCG ATACCCTAAC 720 

GTGCGGCTTT GCCGACCTCA TGGGGTACAT CCCTGTCGTA GGCGCCCCGC TCGGCGGCGT 780 

CGCCAGAGCT CTCGCGCATG GCGTGAGAGT CCTGGAGGAC GGGGTTAATT TTGCAACAGG 840 

GAACTTACCC GGTTGCTCCT TTTCTATCTT CTTGCTGGCC CTGCTGTCCT GCATCACCAC 900 
CCCGGTCTCC GCTGCCGAAG TGAAGAACAT CAGTACCGGC TACATGGTGA CCAACGACTG 960 

CACCAATGAT AGCATTACCT GGCAACTCCA GGCTGCTGTC CTCCACGTCC CCGGGTGCGT 1020 

CCCGTGCGAG AAAGTGGGGA ATACATCTCG GTGCTGGATA CCGGTCTCAC CGAATGTGGC 1080 

CGTGCAGCAG CCCGGCGCCC TCACGCAGGG CTTACGGACG CACATTGACA TGGTTGTGAT 1140 

GTCCGCCACG CTCTGCTCCG CTCTTTACGT GGGGGACCTC TGCGGTGGGG TGATGCTTGC 1200 
AGCCCAGATG TTCATTGTCT CGCCACAGCA CCACTGGTTT GTGCAAGACT GCAATTGCTC 1260
CATCTACCCT GGTACCATCA CTGGACACCG CATGGCGTGG GACATGATGA TGAACTGGTC 1320
GCCCACGGCT ACCATGATCC TGGCGTACGC GATGCGCGTC CCCGAGGTCA TCATAGACAT 1380
CATTGGCGGG GCTCATTGGG GCGTCATGTT CGGCTTAGCC TACTTCTCTA TGCAGGGAGC 1440
GTGGGCAAAA GTCGTTGTCA TTCTTTTGCT GGCCGCCGGG GTGGACGCGC AAACCCATAC 1500
CGTTGGGGGT TCTACCGCGC ATAACGCCAG GACCCTCACC GGCATGTTCT CCCTTGGTGC 1560
CAGGCAGAAA ATCCAGCTCA TCAACACCAA TGGCAGTTGG CACATCAACC GCACCGCCCT 1620
GAACTGCAAT GACTCTTTGC ACACCGGCTT CCTCGCGTCA CTGTTCTACA CCCACAGCTT 1680
CAACTCGTCA GGATGTCCCG AACGCATGTC CGCCTGCCGC AGTATCGAGG CCTTTCGGGT 1740
GGGATGGGGC GCCTTACAAT ATGAGGACAA TGTCACCAAT CCAGAGGATA TGAGACCGTA 1800
TTGCTGGCAC TACCCACCAA GACAGTGTGG TGTAGTCTCC GCGAGCTCTG TGTGTGGCCC 1860
AGTGTACTGT TTCACCCCCA GCCCAGTAGT AGTGGGTACG ACCGATAGAC TTGGAGCGCC 1920
CACTTACACG TGGGGGGAGA ATGAGACAGA TGTCTTCCTA TTGAACAGCA CTCGACCACC 1980
GCAGGGGTCA TGGTTCGGCT GCACGTGGAT GAACTCCACT GGCTACACCA AGACTTGCGG 2040
CGCACCACCC TGCCGCATTA GAGCTGACTT CAATGCCAGC ATGGACTTGT TGTGCCCCAC 2100
GGACTGTTTT AGGAAGCATC CTGATACCAC CTACATCAAA TGTGGCTCTG GGCCCTGGCT 2160
CACGCCAAGG TGCCTGATCG ACTACCCCTA CAGGCTCTGG CATTACCCCT GCACAGTTAA 2220
CTATACCATC TTCAAAATAA GGATGTATGT GGGGGGGGTC GAGCACAGGC TCACGGCTGC 2280
GTGCAATTTC ACTCGTGGGG ATCGTTGCAA CTTGGAGGAC AGAGACAGAA GTCAACTGTC 2340
TCCTTTGCTG CACTCCACCA CGGAGTGGGC CATTTTACCT TGCACTTACT CGGACCTGCC 2400
CGCCTTGTCG ACTGGTCTTC TCCACCTCCA CCAAAACATC GTGGACGTGC AATTCATGTA 2460
TGGCCTATCA CCTGCTCTCA CAAAATACAT CGTCCGATGG GAGTGGGTAG TACTCTTATT 2520
CCTGCTCTTA GCGGACGCCA GGGTTTGCGC CTGCTTATGG ATGCTCATCT TGTTGGGCCA 2580
GGCCGAAGCA GCACTAGAGA AGTTGGTCGT CTTGCACGCT GCGAGCGCAG CTAGCTGCAA 2640
TGGCTTCCTA TACTTTGTCA TCTTTTTCGT GGCTGCTTGG TACATCAAGG GTCGGGTAGT 2700
CCCCTTGGCT ACTTATTCCC TCACTGGCCT ATGGTCCTTT GGCCTACTGC TCCTAGCATT 2760
GCCCCAACAG GCTTATGCTT ATGACGCATC TGTACATGGT CAGATAGGAG CAGCTCTGTT 2820
GGTACTGATC ACTCTCTTTA CACTCACCCC CGGGTATAAG ACCCTTCTCA GCCGGTTTCT 2880
GTGGTGGTTG TGCTATCTTC TGACCCTGGC GGAAGCTATG GTCCAGGAGT GGGCACCACC 2940
TATGCAGGTG CGCGGTGGCC GTGATGGGAT CATATGGGCC GTCGCCATAT TCTGCCCGGG 3000
TGTGGTGTTT GACATAACCA AGTGGCTCTT GGCGGTGCTT GGGCCTGCTT ATCTCCTAAA 3060
AGGTGCTTTG ACGCGTGTGC CGTACTTCGT CAGGGCTCAC GCTCTACTAA GGATGTGCAC 3120
CATGGTAAGG CATCTCGCGG GGGGTAGGTA CGTCCAGATG GTGCTACTAG CCCTTGGCAG 3180
GTGGACTGGC ACTTACATCT ATGACCACCT CACCCCTATG TCGGATTGGG CTGCTAATGG 3240
CCTGCGGGAC TTGGCGGTCG CCGTGGAGCC TATCATCTTC AGTCCGATGG AGAAAAAAGT 3300
CATCGTCTGG GGAGCGGAGA CAGCTGCTTG CGGGGATATC TTACACGGAC TTCCCGTGTC 3360
CGCCCGACTT GGCCGGGAGG TCCTCCTTGG CCCAGCTGAT GGCTATACCT CCAAGGGGTG 3420
GAGTCTTCTC GCCCCCATCA CTGCTTATGC CCAGCAGACA CGCGGCCTTT TGGGCACCAT 3480
AGTGGTGAGC ATGACGGGGC GCGACAAGAC AGAACAGGCC GGGGAGATTC AGGTCCTGTC 3540
CACGGTCACT CAGTCCTTCC TCGGAACAAC CATCTCGGGG GTCTTATGGA CTGTCTACCA 3600
TGGAGCTGGC AACAAGACTC TAGCCGGCTC ACGGGGTCCG GTCACACAGA TGTACTCCAG 3660
TGCTGAGGGG GACTTAGTGG GGTGGCCCAG CCCCCCCGGG ACCAAATCTT TGGAGCCGTG 3720
CACGTGTGGA GCGGTCGACC TATACCTGGT CACGCGAAAC GCTGATGTCA TCCCGGCTCG 3780
AAGACGCGGG GACAAGCGAG GAGCGCTACT CTCCCCGAGA CCTCTTTCCA CCTTGAAGGG 3840
GTCCTCGGGG GGCCCGGTGC TCTGCCCCAG AGGCCACGCT GTCGGGGTCT TCCGGGCAGC 3900
CGTGTGCTCC CGGGGCGTGG CCAAGTCCAT AGATTTTATC CCCGTTGAGA CACTTGACAT 3960
CGTCACTCGG TCCCCCACCT TTAGTGACAA CAGCACACCA CCTGCTGTGC CCCAAACTTA 4020
TCAGGTCGGG TACTTACATG CCCCGACTGG TAGTGGAAAG AGCACCAAAG TCCCTGTCGC 4080
GTATGCCGCT CAGGGGTACA AAGTGCTAGT GCTTAATCCC TCGGTGGCTG CCACCCTGGG 4140
GTTTGGGGCG TACTTGTCCA AGGCACATGG CATCAATCCC AACATTAGGA CTGGGGTCAG 4200
GACTGTGACG ACCGGGGCGC CCATCACGTA CTCCACATAT GGCAAATTCC TCGCCGATGG 4260
GGGCTGCGCA GGCGGCGCCT ATGACATCAT CATATGCGAT GAATGCCATG CCGTGGACTC 4320
TACCACCATT CTCGGCATCG GAACAGTCCT CGATCAAGCA GAGACAGCCG GGGTCAGGCT 4380
AACTGTACTG GCTACGGCTA CGCCCCCCGG GTCAGTGACA ACCCCCCACC CCAACATAGA 4440
GGAGGTGGCC CTCGGGCAGG AGGGTGAGAT CCCCTTCTAT GGGAGGGCGA TTCCCCTGTC 4500
ATACATCAAG GGAGGAAGAC ACTTGATCTT CTGCCACTCA AAGAAAAAGT GTGACGAGCT 4560
CGCGGCGGCC CTTCGGGGTA TGGGCTTGAA CGCAGTGGCA TACTACAGAG GGCTGGACGT 4620
CTCCGTAATA CCAACTCAGG GAGACGTAGT GGTCGTCGCC ACCGACGCCC TCATGACGGG 4680
GTTTACTGGA GACTTTGACT CCGTGATCGA CTGCAACGTA GCGGTCACTC AAGTTGTAGA 4740
CTTCAGCTTG GACCCCACAT TCACCATAAC CACACAGACT GTCCCTCAAG ACGCTGTCTC 4800
ACGTAGCCAG CGCCGGGGCC GCACGGGCAG GGGAAGACTG GGTATTTATA GGTATGTTTC 4860
CACTGGTGAG CGAGCCTCAG GAATGTTTGA CAGTGTAGTG CTCTGCGAGT GCTACGATGC 4920
AGGGGCCGCA TGGTATGAGC TCACACCAGC GGAGACCACC GTCAGGCTCA GAGCATATTT 4980
CAACACACCT GGTTTGCCTG TGTGCCAAGA CCATCTTGAG TTTTGGGAGG CAGTTTTCAC 5040
CGGCCTCACA CACATAGATG CCCACTTCCT TTCCCAAACA AAGCAATCGG GGGAAAATTT 5100
CGCATACTTA ACAGCCTACC AGGCTACAGT GTGCGCTAGG GCCAAAGCCC CCCCCCCGTC 5160
CTGGGACGTC ATGTGGAAGT GTTTGACTCG ACTCAAGCCC ACACTCGTGG GCCCCACACC 5220
TCTCCTGTAC CGCTTGGGCT CTGTTACCAA CGAGGTCACC CTCACGCATC CTGTGACGAA 5280
ATACATCGCC ACCTGCATGC AAGCCGACCT TGAGGTCATG ACCAGCACGT GGGTCTTAGC 5340
TGGGGGGGTC TTGGCGGCCG TCGCCGCGTA CTGCCTGGCG ACCGGGTGTG TTTGCATCAT 5400
CGGCCGCTTG CACGTTAACC AGCGAGCCGT CGTTGCACCG GACAAGGAGG TCCTCTATGA 5460
GGCTTTTGAT GAGATGGAGG AATGTGCCTC TAGAGCGGCT CTCATTGAAG AGGGGCAGCG 5520
GATAGCCGAG ATGCTGAAGT CCAAGATCCA AGGCTTATTG CAGCAAGCTT CCAAACAAGC 5580
TCAAGACATA CAACCCGCTG TGCAGGCTTC TTGGCCCAAG GTAGAGCAAT TCTGGGCCAA 5640
ACACATGTGG AACTTCATCA GCGGCATTCA ATACCTCGCA GGACTATCAA CACTGCCAGG 5700
GAACCCTGCT GTAGCTTCCA TGATGGCATT CAGTGCCGCC CTCACCAGTC CGTTGTCAAC 5760
TAGCACCACT ATCCTTCTCA ACATTTTGGG GGGCTGGCTA GCATCCCAAA TTGCGCCTCC 5820
CGCGGGGGCT ACCGGCTTCG TCGTCAGTGG CCTGGTGGGG GCTGCCGTAG GCAGCATAGG 5880
CTTGGGTAAG GTGCTGGTGG ACATCCTGGC AGGGTATGGT GCGGGCATTT CGGGGGCTCT 5940
CGTCGCATTC AAGATCATGT CTGGCGAGAA GCCCTCCATG GAGGATGTTG TCAACCTGCT 6000
GCCTGGAATT CTGTCTCCGG GTGCCCTGGT GGTGGGAGTC ATCTGCGCGG CCATCCTGCG 6060
CCGACACGTG GGACCGGGGG AAGGCGCTGT CCAATGGATG AATAGGCTCA TTGCCTTTGC 6120
TTCCAGAGGA AACCACGTCG CCCCCACCCA CTACGTGACG GAGTCGGATG CGTCGCAGCG 6180
TGTGACCCAA CTACTTGGCT CCCTTACCAT AACCAGCCTG CTCAGGAGAC TCCACAACTG 6240
GATTACTGAA GACTGCCCCA TCCCATGCAG CGGCTCGTGG CTCCGCGATG TGTGGGATTG 6300
GGTTTGCACC ATCCTAACAG ACTTTAAAAA CTGGCTGACC TCCAAATTGT TCCCAAAGAT 6360
GCCTGGTCTC CCCTTTATCT CTTGTCAAAA GGGGTACAAG GGCGTGTGGG CTGGCACTGG 6420
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TATCATGACC 


ACACGGTGTC 


CTTGCGGCGC 


CAATATCTCT 


GGCAATGTCC 

VJ u V/ i \ *V l vi I \y Vr 


GCCTGGGPTT 


O40u 


CATGAGAATT 


ACGGGGCCCA 

l * V/ VI VI VI X* Vr v v ■ * 


AAACCTGCAT 


GAATATCTGG 


CAGGGGACCT 

V/ 1 * VI VI V* VI* » v/ \y 1 


TTCCCATfAA 


vy J *4 \J 


TTGTTACACG 


GAGGGCCAGT 


GCGTGCCGAA 


ACCCGCACCA 


AACTTTAAGA 

fifi \y i i i i 1 1 i vi ■ i 


TCGCCATCTG 


6600 


GAGGGTGGCG 


GCCTCAGAGT 


ACGCGGAGGT 


GACGCAGCAC 


GGGTCATACC 

VI VI 1 Vr 1 1 * * » Vr V 


ACTACATAAC 


6660 

vy V/ v/ u 


AGGACTTACC 


ACTGATAACT 


TGAAAGTTCC 


TTGCCAACTA 


CCTTCTCCAG 

\J Vr 1 I Vr 1 Vr Vr f t VI 


^GTTCTTTTC 


6720 


CTGGGTGGAC 


GGAGTGCAGA 


TCCATAGGTT 


TGCCCCCATA 


CCGAAGCCGT 


TTTTTCGGGA 


6780 


TGAGGTCTCG 


TTCTGCGTTG 


GGCTTAATTC 


ATTTGTCGTC 


GGGTCTCAGC 


TCCCTTGCGA 

i V Vr Vr I I VI Vr Vi 1 \ 


6840 


TCCTGAACCT 


GACACAGACG 


TATTGACGTC 


CATGCTAACA 


GACCCATCCC 


ATATCACGGC 

1 \ 1 i * 1 V/ \f \jl Vi V^ 


6900 


GGAGACTGCA 

VI Vf 1 » VI 1 l Vr 1 VI v 1 1 


GCGCGGCGTT 

\m V VI Vr VI Vi Vf V* 1 1 


TGGCACGGGG 


GTCACCCCCG 

\J 1 V 1 i V v V v Vr Vfl 


TCCGAGGCAA GCTCCTCAGC 6960
GAGCCAGCTA TCGGCACCAT CGCTGCGAGC CACCTGCACC ACCCACGGCA AGGCCTATGA 7020
TGTGGACATG GTGGATGCCA ACCTGTTCAT GGGGGGCGAT GTGACCCGGA TAGAGTCTGA 7080
GTCCAAAGTG GTCGTTCTGG ACTCTCTCGA CCCAATGGTC GAAGAAAGGA GCGACCTTGA 7140
GCCTTCGATA CCATCGGAAT ATATGCTCCC CAAGAAGAGA TTCCCACCAG CCTTACCGGC 7200
TTGGGCACGG CCTGATTACA ACCCACCGCT TGTGGAATCG TGGAAGAGGC CAGATTACCA 7260
ACCGGCCACT GTTGCGGGCT GCGCTCTCCC CCCCCCTAAG AAAACCCCGA CGCCTCCCCC 7320
AAGGAGACGC CGGACAGTGG GTCTGAGTGA GAGCTCCATA GCAGATGCCC TACAACAGCT 7380
GGCCATCAAG TCCTTTGGCC AGCCCCCCCC AAGCGGCGAT TCAGGCCTTT CCACGGGGGC 7440
GGACGCAGCC GATTCCGGCA GTCGGACGCC CCCCGATGAG TTGGCCCTTT CGGAGACAGG 7500
TTCCATCTCC TCCATGCCCC CTCTCGAGGG GGAGCCTGGA GATCCAGACT TGGAGCCTGA 7560
GCAGGTAGAG CTTCAACCTC CCCCCCAGGG GGGGGTGGTA ACCCCCGGCT CAGGCTCGGG 7620
GTCTTGGTCT ACTTGCTCCG AGGAGGACGA CTCCGTCGTG TGCTGCTCCA TGTCATACTC 7680
CTGGACCGGG GCTCTAATAA CTCCTTGTAG CCCCGAAGAG GAAAAGTTGC CAATTGGCCC 7740
CTTGAGCAAC TCCCTGTTGC GATATCACAA CAAGGTGTAC TGTACCACAT CAAAGAGCGC 7800
CTCATTAAGG GCTAAAAAGG TAACTTTTGA TAGGATGCAA GCGCTCGACG CTCATTATGA 7860
CTCAGTCTTG AAGGACATTA AGCTAGCGGC CTCCAAGGTC ACCGCAAGGC TTCTCACTTT 7920
AGAGGAGGCC TGCCAGTTAA CTCCACCCCA CTCTGCAAGA TCCAAGTATG GGTTTGGGGC 7980
TAAGGAGGTC CGCAGCTTGT CCGGGAGAGC CGTTAACCAC ATCAAGTCCG TGTGGAAGGA 8040
CCTCCTGGAA GACACACAAA CACCAATTCC TACAACCATC ATGGCCAAAA ATGAGGTGTT 8100
CTGCGTGGAC CCCACCAAGG GGGGTAAGAA AGCAGCTCGC CTTATCGTTT ACCCTGACCT 8160
CGGCGTCAGG GTCTGCGAGA AAATGGCCCT TTATGATATC ACACAAAAGC TTCCTCAGGC 8220
GGTGATGGGG GCTTCTTATG GATTCCAGTA CTCCCCCGCT CAGCGGGTGG AGTTTCTCTT 8280 
GAAGGCATGG GCGGAAAAGA AAGACCCTAT GGGTTTTTCG TATGATACCC GATGCTTTGA 8340 
CTCAACCGTC ACTGAGAGAG ACATCAGGAC TGAGGAGTCC ATATATCGGG CTTGTTCCTT 8400 
GCCCGAGGAG GCCCACACTG CCATACACTC ACTGACTGAG AGACTTTACG TGGGAGGGCC 8460 
CATGTTCAAC AGCAAGGGCC AGACCTGCGG GTACAGGCGT TGCCGCGCCA GCGGGGTGCT 8520 
TACCACTAGC ATGGGGAACA CCATCACATG CTATGTGAAA GCCTTAGCGG CCTGTAAGGC 8580 
TGCAGGGATA ATTGCGCCCA CAATGCTGGT ATGCGGCGAT GACTTGGTTG TCATCTCAGA 8640 
GAGCCAGGGG ACCGAGGAGG ACGAGCGGAA CCTGAGAGCC TTCACGGAGG CTATGACCAG 8700 
GTATTCTGCC CCTCCTGGTG ACCCCCCCAG ACCGGAATAT GACCTGGAGC TGATAACATC 8760 
TTGCTCCTCA AATGTGTCTG TGGCGTTGGG CCCACAAGGC CGCCGCAGAT ACTACCTGAC 8820 
CAGAGACCCT ACCACTCCAA TCGCCCGGGC TGCCTGGGAA ACAGTTAGAC ACTCCCCTGT 8880
CAATTCATGG CTAGGAAACA TCATCCAGTA CGCCCCAACC ATATGGGCTC GCATGGTCCT 8940 
GATGACACAC TTCTTCTCCA TTCTCATGGC CCAAGATACT CTGGACCAGA ACCTCAACTT 9000 
TGAGATGTAC GGAGCGGTGT ACTCCGTGAG TCCCTTGGAC CTCCCAGCCA TAATTGAAAG 9060 
GTTACACGGG CTTGACGCTT TCTCTCTGCA CACATACACT CCCCACGAAC TGACACGGGT 9120 
GGCTTCAGCC CTCAGAAAAC TTGGGGCGCC ACCCCTCAGA GCGTGGAAGA GCCGGGCACG 9180 
TGCAGTCAGG GCGTCCCTCA TCTCCCGTGG GGGGAGAGCG GCCGTTTGCG GCCGATATCT 9240 
CTTCAACTGG GCGGTGAAGA CCAAGCTCAA ACTCACTCCA TTGCCGGAAG CGCGCCTCCT 9300 
GGATTTATCC AGCTGGTTCA CTGTCGGCGC CGGCGGGGGC GACATTTATC ACAGCGTGTC 9360 
GCGTGCCCGA CCCCGCTTAT TACTCCTTGG CCTACTCCTA CTTTTTGTAG GGGTAGGCCT 9420 
TTTCCTACTC CCCGCTCGGT AGAGCGGCAC ACATTAGCTA CACTCCATAG CTAACTGTCC 9480 
CTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT 9540 
TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTT 9589
Sequence ID No. 3
Sequence Length: 3,970
Sequence Type: nucleic acid 
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



GGCATTACCC CTGCACAGTT AACTATACCA TCTTCAAAAT AAGGATGTAT GTGGGGGGGG 60
TCGAGCACAG GCTCACGGCT GCGTGCAATT TCACTCGTGG GGATCGTTGC AACTTGGAGG 120
ACAGAGACAG AAGTCAACTG TCTCCTTTGC TGCACTCCAC CACGGAGTGG GCCATTTTAC 180
CTTGCACTTA CTCGGACCTG CCCGCCTTGT [illegible] [illegible] [illegible] 240
TCGTGGACGT GCAATTCATG TATGGCCTAT CACCTGCTCT CACAAAATAC ATCGTCCGAT 300
GGGAGTGGGT AGTACTCTTA TTCCTGCTCT TAGCGGACGC CAGGGTTTGC GCCTGCTTAT 360
GGATGCTCAT CTTGTTGGGC CAGGCCGAAG CAGCACTAGA GAAGTTGGTC GTCTTGCACG 420
CTGCGAGCGC AGCTAGCTGC AATGGCTTCC TATACTTTGT CATCTTTTTC GTGGCTGCTT 480
GGTACATCAA GGGTCGGGTA GTCCCCTTGG CTACTTATTC CCTCACTGGC CTATGGTCCT 540
TTGGCCTACT GCTCCTAGCA TTGCCCCAAC AGGCTTATGC TTATGACGCA TCTGTACATG 600
GTCAGATAGG AGCAGCTCTG TTGGTACTGA TCACTCTCTT TACACTCACC CCCGGGTATA 660
AGACCCTTCT CAGCCGGTTT CTGTGGTGGT TGTGCTATCT TCTGACCCTG GCGGAAGCTA 720
TGGTCCAGGA GTGGGCACCA CCTATGCAGG TGCGCGGTGG CCGTGATGGG ATCATATGGG 780
CCGTCGCCAT ATTCTGCCCG GGTGTGGTGT TTGACATAAC CAAGTGGCTC TTGGCGGTGC 840
TTGGGCCTGC TTATCTCCTA AAAGGTGCTT TGACGCGTGT GCCGTACTTC GTCAGGGCTC 900
ACGCTCTACT AAGGATGTGC ACCATGGTAA GGCATCTCGC GGGGGGTAGG TACGTCCAGA 960
TGGTGCTACT AGCCCTTGGC AGGTGGACTG GCACTTACAT CTATGACCAC CTCACCCCTA 1020
TGTCGGATTG GGCTGCTAAT GGCCTGCGGG ACTTGGCGGT CGCCGTGGAG CCTATCATCT 1080
TCAGTCCGAT GGAGAAAAAA GTCATCGTCT GGGGAGCGGA GACAGCTGCT TGCGGGGATA 1140
TCTTACACGG ACTTCCCGTG TCCGCCCGAC TTGGCCGGGA GGTCCTCCTT GGCCCAGCTG 1200
ATGGCTATAC CTCCAAGGGG TGGAGTCTTC TCGCCCCCAT CACTGCTTAT GCCCAGCAGA 1260
CACGCGGCCT TTTGGGCACC ATAGTGGTGA GCATGACGGG GCGCGACAAG ACAGAACAGG 1320 
CCGGGGAGAT TCAGGTCCTG TCCACGGTCA CTCAGTCCTT CCTCGGAACA ACCATCTCGG 1380 
GGGTCTTATG GACTGTCTAC CATGGAGCTG GCAACAAGAC TCTAGCCGGC TCACGGGGTC 1440 
CGGTCACACA GATGTACTCC AGTGCTGAGG GGGACTTAGT GGGGTGGCCC AGCCCCCCCG 1500 
GGACCAAATC TTTGGAGCCG TGCACGTGTG GAGCGGTCGA CCTATACCTG GTCACGCGAA 1560 
ACGCTGATGT CATCCCGGCT CGAAGACGCG GGGACAAGCG AGGAGCGCTA CTCTCCCCGA 1620 
GACCTCTTTC CACCTTGAAG GGGTCCTCGG GGGGCCCGGT GCTCTGCCCC AGAGGCCACG 1680
CTGTCGGGGT CTTCCGGGCA GCCGTGTGCT CCCGGGGCGT GGCCAAGTCC ATAGATTTTA 1740 
TCCCCGTTGA GACACTTGAC ATCGTCACTC GGTCCCCCAC CTTTAGTGAC AACAGCACAC 1800 
CACCTGCTGT GCCCCAAACT TATCAGGTCG GGTACTTACA TGCCCCGACT GGTAGTGGAA 1860 
AGAGCACCAA AGTCCCTGTC GCGTATGCCG CTCAGGGGTA CAAAGTGCTA GTGCTTAATC 1920 
CCTCGGTGGC TGCCACCCTG GGGTTTGGGG CGTACTTGTC CAAGGCACAT GGCATCAATC 1980 
CCAACATTAG GACTGGGGTC AGGACTGTGA CGACCGGGGC GCCCATCACG TACTCCACAT 2040 
ATGGCAAATT CCTCGCCGAT GGGGGCTGCG CAGGCGGCGC CTATGACATC ATCATATGCG 2100 
ATGAATGCCA TGCCGTGGAC TCTACCACCA TTCTCGGCAT CGGAACAGTC CTCGATCAAG 2160 
CAGAGACAGC CGGGGTCAGG CTAACTGTAC TGGCTACGGC TACGCCCCCC GGGTCAGTGA 2220 
CAACCCCCCA CCCCAACATA GAGGAGGTGG CCCTCGGGCA GGAGGGTGAG ATCCCCTTCT 2280 
ATGGGAGGGC GATTCCCCTG TCATACATCA AGGGAGGAAG ACACTTGATC TTCTGCCACT 2340 
CAAAGAAAAA GTGTGACGAG CTCGCGGCGG CCCTTCGGGG TATGGGCTTG AACGCAGTGG 2400
CATACTACAG AGGGCTGGAC GTCTCCGTAA TACCAACTCA GGGAGACGTA GTGGTCGTCG 2460 
CCACCGACGC CCTCATGACG GGGTTTACTG GAGACTTTGA CTCCGTGATC GACTGCAACG 2520 
TAGCGGTCAC TCAAGTTGTA GACTTCAGCT TGGACCCCAC ATTCACCATA ACCACACAGA 2580 
CTGTCCCTCA AGACGCTGTC TCACGTAGCC AGCGCCGGGG CCGCACGGGC AGGGGAAGAC 2640 
TGGGTATTTA TAGGTATGTT TCCACTGGTG AGCGAGCCTC AGGAATGTTT GACAGTGTAG 2700 
TGCTCTGCGA GTGCTACGAT GCAGGGGCCG CATGGTATGA GCTCACACCA GCGGAGACCA 2760 
CCGTCAGGCT CAGAGCATAT TTCAACACAC CTGGTTTGCC TGTGTGCCAA GACCATCTTG 2820 
AGTTTTGGGA GCAGTTTTC ACCGGCCTCA CACACATAGA TGCCCACTTC CTTTCCCAAA 2880 
CAAAGCAATC GGGGGAAAAT TTCGCATACT TAACAGCCTA CCAGGCTACA GTGTGCGCTA 2940
GGGCCAAAGC CCCCCCCCCG TCCTGGGACG TCATGTGGAA GTGTTTGACT CGACTCAAGC 3000
CCACACTCGT GGGCCCCACA CCTCTCCTGT ACCGCTTGGG CTCTGTTACC AACGAGGTCA 3060
CCCTCACGCA TCCTGTGACG AAATACATCG CCACCTGCAT GCAAGCCGAC CTTGAGGTCA 3120
TGACCAGCAC GTGGGTCTTA GCTGGGGGGG TCTTGGCGGC CGTCGCCGCG TACTGCCTGG 3180
CGACCGGGTG TGTTTGCATC ATCGGCCGCT TGCACGTTAA CCAGCGAGCC GTCGTTGCAC 3240
CGGACAAGGA GGTCCTCTAT GAGGCTTTTG ATGAGATGGA GGAATGTGCC TCTAGAGCGG 3300
CTCTCATTGA AGAGGGGCAG CGGATAGCCG AGATGCTGAA GTCCAAGATC CAAGGCTTAT 3360
TGCAGCAAGC TTCCAAACAA GCTCAAGACA TACAACCCGC TGTGCAGGCT TCTTGGCCCA 3420
AGGTAGAGCA ATTCTGGGCC AAACACATGT GGAACTTCAT CAGCGGCATT CAATACCTCG 3480
CAGGACTATC AACACTGCCA GGGAACCCTG CTGTAGCTTC CATGATGGCA TTCAGTGCCG 3540
CCCTCACCAG TCCGTTGTCA ACTAGCACCA CTATCCTTCT CAACATTTTG GGGGGCTGGC 3600
TAGCATCCCA AATTGCGCCT CCCGCGGGGG CTACCGGCTT CGTCGTCAGT GGCCTGGTGG 3660
GGGCTGCCGT AGGCAGCATA GGCTTGGGTA AGGTGCTGGT GGACATCCTG GCAGGGTATG 3720
GTGCGGGCAT TTCGGGGGCT CTCGTCGCAT TCAAGATCAT GTCTGGCGAG AAGCCCTCCA 3780
TGGAGGATGT TGTCAACCTG CTGCCTGGAA TTCTGTCTCC GGGTGCCCTG GTGGTGGGAG 3840
TCATCTGCGC GGCCATCCTG CGCCGACACG TGGGACCGGG GGAAGGCGCT GTCCAATGGA 3900
TGAATAGGCT CATTGCCTTT GCTTCCAGAG GAAACCACGT CGCCCCCACC CACTACGTGA 3960
CGGAGTCGGA 3970
Sequence ID No. 4
Sequence Length: 2,693 
Sequence Type: nucleic acid 
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



ATTCTGTCTC CGGGTGCCCT GGTGGTGGGA GTCATCTGCG CGGCCATCCT GCGCCGACAC 60
GTGGGACCGG GGGAAGGCGC TGTCCAATGG ATGAATAGGC TCATTGCCTT TGCTTCCAGA 120
GGAAACCACG TCGCCCCCAC CCACTACGTG ACGGAGTCGG ATGCGTCGCA GCGTGTGACC 180
CAACTACTTG GCTCCCTTAC CATAACCAGC CTGCTCAGGA GACTCCACAA CTGGATTACT 240
GAAGACTGCC CCATCCCATG CAGCGGCTCG TGGCTCCGCG ATGTGTGGGA TTGGGTTTGC 300
ACCATCCTAA CAGACTTTAA AAACTGGCTG ACCTCCAAAT TGTTCCCAAA GATGCCTGGT 360
CTCCCCTTTA TCTCTTGTCA AAAGGGGTAC AAGGGCGTGT GGGCTGGCAC TGGTATCATG 420
ACCACACGGT GTCCTTGCGG CGCCAATATC TCTGGCAATG TCCGCCTGGG CTCCATGAGA 480
ATTACGGGGC CCAAAACCTG CATGAATATC TGGCAGGGGA CCTTTCCCAT CAATTGTTAC 540
ACGGAGGGCC AGTGCGTGCC GAAACCCGCA CCAAACTTTA AGATCGCCAT CTGGAGGGTG 600
GCGGCCTCAG AGTACGCGGA GGTGACGCAG CACGGGTCAT ACCACTACAT AACAGGACTT 660
ACCACTGATA ACTTGAAAGT TCCTTGCCAA CTACCTTCTC CAGAGTTCTT TTCCTGGGTG 720
GACGGAGTGC AGATCCATAG GTTTGCCCCC ATACCGAAGC CGTTTTTTCG GGATGAGGTC 780
TCGTTCTGCG TTGGGCTTAA TTCATTTGTC GTCGGGTCTC AGCTCCCTTG CGATCCTGAA 840
CCTGACACAG ACGTATTGAC GTCCATGCTA ACAGACCCAT CCCATATCAC GGCGGAGACT 900
GCAGCGCGGC GTTTGGCACG GGGGTCACCC CCGTCCGAGG CAAGCTCCTC AGCGAGCCAG 960
CTATCGGCAC CATCGCTGCG AGCCACCTGC ACCACCCACG GCAAGGCCTA TGATGTGGAC 1020
ATGGTGGATG CCAACCTGTT CATGGGGGGC GATGTGACCC GGATAGAGTC TGAGTCCAAA 1080
GTGGTCGTTC TGGACTCTCT CGACCCAATG GTCGAAGAAA GGAGCGACCT TGAGCCTTCG 1140
ATACCATCGG AATATATGCT CCCCAAGAAG AGATTCCCAC CAGCCTTACC GGCTTGGGCA 1200
CGGCCTGATT ACAACCCACC GCTTGTGGAA TCGTGGAAGA GGCCAGATTA CCAACCGGCC 1260
ACTGTTGCGG GCTGCGCTCT CCCCCCCCCT AAGAAAACCC CGACGCCTCC CCCAAGGAGA 1320 
CGCCGGACAG TGGGTCTGAG TGAGAGCTCC ATAGCAGATG CCCTACAACA GCTGGCCATC 1380 
AAGTCCTTTG GCCAGCCCCC CCCAAGCGGC GATTCAGGCC TTTCCACGGG GGCGGACGCA 1440 
GCCGATTCCG GCAGTCGGAC GCCCCCCGAT GAGTTGGCCC TTTCGGAGAC AGGTTCCATC 1500 
TCCTCCATGC CCCCTCTCGA GGGGGAGCCT GGAGATCCAG ACTTGGAGCC TGAGCAGGTA 1560 
GAGCTTCAAC CTCCCCCCCA GGGGGGGGTG GTAACCCCCG GCTCAGGCTC GGGGTCTTGG 1620 
TCTACTTGCT CCGAGGAGGA CGACTCCGTC GTGTGCTGCT CCATGTCATA CTCCTGGACC 1680 
GGGGCTCTAA TAACTCCTTG TAGCCCCGAA GAGGAAAAGT TGCCAATTGG CCCCTTGAGC 1740 
AACTCCCTGT TGCGATATCA CAACAAGGTG TACTGTACCA CATCAAAGAG CGCCTCATTA 1800 
AGGGCTAAAA AGGTAACTTT TGATAGGATG CAAGCGCTCG ACGCTCATTA TGACTCAGTC 1860 
TTGAAGGACA TTAAGCTAGC GGCCTCCAAG GTCACCGCAA GGCTTCTCAC TTTAGAGGAG 1920 
GCCTGCCAGT TAACTCCACC CCACTCTGCA AGATCCAAGT ATGGGTTTGG GGCTAAGGAG 1980 
GTCCGCAGCT TGTCCGGGAG AGCCGTTAAC CACATCAAGT CCGTGTGGAA GGACCTCCTG 2040 
GAAGACACAC AAACACCAAT TCCTACAACC ATCATGGCCA AAAATGAGGT GTTCTGCGTG 2100 
GACCCCACCA AGGGGGGTAA GAAAGCAGCT CGCCTTATCG TTTACCCTGA CCTCGGCGTC 2160 
AGGGTCTGCG AGAAAATGGC CCTTTATGAT ATCACACAAA AGCTTCCTCA GGCGGTGATG 2220 
GGGGCTTCTT ATGGATTCCA GTACTCCCCC GCTCAGCGGG TGGAGTTTCT CTTGAAGGCA 2280
TGGGCGGAAA AGAAAGACCC TATGGGTTTT TCGTATGATA CCCGATGCTT TGACTCAACC 2340 
GTCACTGAGA GAGACATCAG GACTGAGGAG TCCATATATC GGGCTTGTTC CTTGCCCGAG 2400 
GAGGCCCACA CTGCCATACA CTCACTGACT GAGAGACTTT ACGTGGGAGG GCCCATGTTC 2460 
AACAGCAAGG GCCAGAGCTG CGGGTACAGG CGTTGCCGCG CCAGCGGGGT GCTTACCACT 2520 
AGCATGGGGA ACACCATCAC ATGCTATGTG AAAGCCTTAG CGGCCTGTAA GGCTGCAGGG 2580 
ATAATTGCGC CCACAATGCT GGTATGCGGC GATGACTTGG TTGTCATCTC AGAGAGCCAG 2640 
GGGACCGAGG AGGACGAGCG GAACCTGAGA GCCTTCACGG AGGCTATGAC CAG 2693
Sequence ID No. 5
Sequence Length: 3,033 
Sequence Type: amino acid 
Topology: linear 
Molecule Type: protein 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
5 10 15

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile
20 25 30

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly
35 40 45

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly
50 55 60

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser
65 70 75

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly
80 85 90

Leu Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro
95 100 105

Ser Trp Gly Pro Asn Asp Pro Arg His Arg Ser Arg Asn Val Gly
110 115 120

Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly
125 130 135

Tyr Ile Pro Val Val Gly Ala Pro Leu Gly Gly Val Ala Arg Ala
140 145 150

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Phe Ala
155 160 165

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala
170 175 180

Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala Glu Val Lys
185 190 195

Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr Asn Asp
200 205 210

Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro Gly
215 220 225

Cys Val Pro Cys Glu Lys Val Gly Asn Thr Ser Arg Cys Trp Ile
230 235 240

Pro Val Ser Pro Asn Val Ala Val Gln Gln Pro Gly Ala Leu Thr
245 250 255

Gln Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr
260 265 270

Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met
275 280 285

Leu Ala Ala Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe
290 295 300

Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly
305 310 315

His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Ala
320 325 330

Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro Glu Val Ile Ile
335 340 345

Asp Ile Ile Gly Gly Ala His Trp Gly Val Met Phe Gly Leu Ala
350 355 360

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Val Val Ile Leu
365 370 375

Leu Leu Ala Ala Gly Val Asp Ala Gln Thr His Thr Val Gly Gly
380 385 390

Ser Thr Ala His Asn Ala Arg Thr Leu Thr Gly Met Phe Ser Leu
395 400 405

Gly Ala Arg Gln Lys Ile Gln Leu Ile Asn Thr Asn Gly Ser Trp
410 415 420

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu His Thr
425 430 435

Gly Phe Leu Ala Ser Leu Phe Tyr Thr His Ser Phe Asn Ser Ser
440 445 450

Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala Phe
455 460 465

Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn
470 475 480

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln
485 490 495

Cys Gly Val Val Ser Ala Ser Ser Val Cys Gly Pro Val Tyr Cys
500 505 510

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly
515 520 525

Ala Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu
530 535 540

Leu Asn Ser Thr Arg Pro Pro Gln Gly Ser Trp Phe Gly Cys Thr
545 550 555

Trp Met Asn Ser Thr Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro
560 565 570

Cys Arg Ile Arg Ala Asp Phe Asn Ala Ser Met Asp Leu Leu Cys
575 580 585

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Thr Thr Tyr Ile Lys
590 595 600

Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Leu Ile Asp Tyr
605 610 615

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Tyr Thr Ile
620 625 630

Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu His Arg Leu Thr
635 640 645

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Asn Leu Glu Asp
650 655 660

Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser Thr Thr Glu
665 670 675

Trp Ala Ile Leu Pro Cys Thr Tyr Ser Asp Leu Pro Ala Leu Ser
680 685 690

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Phe
695 700 705

Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp
710 715 720

Glu Trp Val Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val
725 730 735

Cys Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala
740 745 750

Ala Leu Glu Lys Leu Val Val Leu His Ala Ala Ser Ala Ala Ser
755 760 765

Cys Asn Gly Phe Leu Tyr Phe Val Ile Phe Phe Val Ala Ala Trp
770 775 780

Tyr Ile Lys Gly Arg Val Val Pro Leu Ala Thr Tyr Ser Leu Thr
785 790 795

Gly Leu Trp Ser Phe Gly Leu Leu Leu Leu Ala Leu Pro Gln Gln
800 805 810

Ala Tyr Ala Tyr Asp Ala Ser Val His Gly Gln Ile Gly Ala Ala
815 820 825

Leu Leu Val Leu Ile Thr Leu Phe Thr Leu Thr Pro Gly Tyr Lys
830 835 840

Thr Leu Leu Ser Arg Phe Leu Trp Trp Leu Cys Tyr Leu Leu Thr
845 850 855

Leu Ala Glu Ala Met Val Gln Glu Trp Ala Pro Pro Met Gln Val
860 865 870

Arg Gly Gly Arg Asp Gly Ile Ile Trp Ala Val Ala Ile Phe Cys
875 880 885

Pro Gly Val Val Phe Asp Ile Thr Lys Trp Leu Leu Ala Val Leu
890 895 900

Gly Pro Ala Tyr Leu Leu Lys Gly Ala Leu Thr Arg Val Pro Tyr
905 910 915

Phe Val Arg Ala His Ala Leu Leu Arg Met Cys Thr Met Val Arg
920 925 930

His Leu Ala Gly Gly Arg Tyr Val Gln Met Val Leu Leu Ala Leu
935 940 945

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Thr Pro Met
950 955 960

Ser Asp Trp Ala Ala Asn Gly Leu Arg Asp Leu Ala Val Ala Val
965 970 975

Glu Pro Ile Ile Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp
980 985 990

Gly Ala Glu Thr Ala Ala Cys Gly Asp Ile Leu His Gly Leu Pro
995 1000 1005

Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp
1010 1015 1020

Gly Tyr Thr Ser Lys Gly Trp Ser Leu Leu Ala Pro Ile Thr Ala
1025 1030 1035

Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly Thr Ile Val Val Ser
1040 1045 1050

Met Thr Gly Arg Asp Lys Thr Glu Gln Ala Gly Glu Ile Glu Val
1055 1060 1065

Leu Ser Thr Val Thr Gln Ser Phe Leu Gly Thr Thr Ile Ser Gly
1070 1075 1080

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala
1085 1090 1095

Gly Ser Arg Gly Pro Val Thr Gln Met Tyr Ser Ser Ala Glu Gly
1100 1105 1110

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Glu
1115 1120 1125

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn
1130 1135 1140

Ala Asp Val Ile Pro Ala Arg Arg Arg Gly Asp Lys Arg Gly Ala
1145 1150 1155

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly
1160 1165 1170

Gly Pro Val Leu Cys Pro Arg Gly His Ala Val Gly Val Phe Arg
1175 1180 1185

Ala Ala Val Cys Ser Arg Gly Val Ala Lys Ser Ile Asp Phe Ile
1190 1195 1200

Pro Val Glu Thr Leu Asp Ile Val Thr Arg Ser Pro Thr Phe Ser
1205 1210 1215

Asp Asn Ser Thr Pro Pro Ala Val Pro Gln Thr Tyr Gln Val Gln
1220 1225 1230

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro
1235 1240 1245

Val Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1250 1255 1260

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Leu Ser Lys Ala
1265 1270 1275

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr
1280 1285 1290

Thr Gly Ala Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala
1295 1300 1305

Asp Gly Gly Cys Ala Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp
1310 1315 1320

Glu Cys His Ala Val Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr
1325 1330 1335

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Thr Val Leu
1340 1345 1350

Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn
1355 1360 1365

Ile Glu Glu Val Ala Leu Gly Gln Glu Gly Glu Ile Pro Phe Tyr
1370 1375 1380

Gly Arg Ala Ile Pro Leu Ser Tyr Ile Lys Gly Gly Arg His Leu
1385 1390 1395

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala
1400 1405 1410

Leu Arg Gly Met Gly Leu Asn Ala Val Ala Tyr Tyr Arg Gly Leu
1415 1420 1425

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala
1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val
1445 1450 1455

Ile Asp Cys Asn Val Ala Val Thr Gln Val Val Asp Phe Ser Leu
1460 1465 1470

Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala
1475 1480 1485

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu
1490 1495 1500

Gly Ile Tyr Arg Tyr Val Ser Thr Gly Glu Arg Ala Ser Gly Met
1505 1510 1515

Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala Gly Ala Ala
1520 1525 1530

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala
1535 1540 1545

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu
1550 1555 1560

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asp Ala His
1565 1570 1575

Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Phe Ala Tyr Leu
1580 1585 1590

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro
1595 1600 1605

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro
1610 1615 1620

Trp Leu Val Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ser Val
1625 1630 1635

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala
1640 1645 1650

Thr Cys Met Gln Ala Asp Leu Glu Val Met Thr Ser Thr Trp Val
1655 1660 1665

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala
1670 1675 1680

Thr Gly Cys Val Cys Ile Ile Gly Arg Leu His Val Asn Gln Arg
1685 1690 1695

Ala Val Val Ala Pro Asp Lys Glu Val Leu Tyr Glu Ala Phe Asp
1700 1705 1710

Glu Met Glu Glu Cys Ala Ser Arg Ala Ala Leu Ile Glu Glu Gly
1715 1720 1725

Gln Arg Ile Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu
1730 1735 1740

Gln Gln Ala Ser Lys Gln Ala Gln Asp Ile Gln Pro Ala Val Gln
1745 1750 1755

Ala Ser Trp Pro Lys Val Glu Gln Phe Trp Ala Lys His Met Trp
1760 1765 1770

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu
1775 1780 1785

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala
1790 1795 1800

Leu Thr Ser Pro Leu Ser Thr Ser Thr Thr Ile Leu Leu Asn Ile
1805 1810 1815

Leu Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala
1820 1825 1830

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser
1835 1840 1845

Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly
1850 1855 1860

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly
1865 1870 1875

Glu Lys Pro Ser Met Glu Asp Val Val Asn Leu Leu Pro Gly Ile
1880 1885 1890

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile
1895 1900 1905

Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met
1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro
1925 1930 1935

Thr His Tyr Val Thr Glu Ser Asp Ala Ser Gln Arg Val Thr Gln
1940 1945 1950

Leu Leu Gly Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His
1955 1960 1965

Asn Trp Ile Thr Glu Asp Cys Pro Ile Pro Cys Ser Gly Ser Trp
1970 1975 1980

Leu Arg Asp Val Trp Asp Trp Val Cys Thr Ile Leu Thr Asp Phe
1985 1990 1995

Lys Asn Trp Leu Thr Ser Lys Leu Phe Pro Lys Met Pro Gly Leu
2000 2005 2010

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly
2015 2020 2025

Thr Gly Ile Met Thr Thr Arg Cys Pro Cys Gly Ala Asn Ile Ser
2030 2035 2040

Gly Asn Val Arg Leu Gly Ser Met Arg Ile Thr Gly Pro Lys Thr
2045 2050 2055

Cys Met Asn Ile Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr
2060 2065 2070

Glu Gly Gln Cys Val Pro Lys Pro Ala Pro Asn Phe Lys Ile Ala
2075 2080 2085

Ile Trp Arg Val Ala Ala Ser Glu Tyr Ala Glu Val Thr Gln His
2090 2095 2100

Gly Ser Tyr His Tyr Ile Thr Gly Leu Thr Thr Asp Asn Leu Lys
2105 2110 2115

Val Pro Cys Gln Leu Pro Ser Pro Glu Phe Phe Ser Trp Val Asp
2120 2125 2130

Gly Val Gln Ile His Arg Phe Ala Pro Ile Pro Lys Pro Phe Phe
2135 2140 2145

Arg Asp Glu Val Ser Phe Cys Val Gly Leu Asn Ser Phe Val Val
2150 2155 2160

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Asp Val Leu
2165 2170 2175

Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr Ala
2180 2185 2190

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Glu Ala Ser Ser
2195 2200 2205

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Arg Ala Thr Cys Thr
2210 2215 2220

Thr His Gly Lys Ala Tyr Asp Val Asp Met Val Asp Ala Asn Leu
2225 2230 2235

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Glu Ser Lys Val
2240 2245 2250

Val Val Leu Asp Ser Leu Asp Pro Met Val Glu Glu Arg Ser Asp
2255 2260 2265

Leu Glu Pro Ser Ile Pro Ser Glu Tyr Met Leu Pro Lys Lys Arg
2270 2275 2280

Phe Pro Pro Ala Leu Pro Ala Trp Ala Arg Pro Asp Tyr Asn Pro
2285 2290 2295

Pro Leu Val Glu Ser Trp Lys Arg Pro Asp Tyr Gln Pro Ala Thr
2300 2305 2310

Val Ala Gly Cys Ala Leu Pro Pro Pro Lys Lys Thr Pro Thr Pro
2315 2320 2325

Pro Pro Arg Arg Arg Arg Thr Val Gly Leu Ser Glu Ser Ser Ile
2330 2335 2340

Ala Asp Ala Leu Gln Gln Leu Ala Ile Lys Ser Phe Gly Gln Pro
2345 2350 2355

Pro Pro Ser Gly Asp Ser Gly Leu Ser Thr Gly Ala Asp Ala Ala
2360 2365 2370

Asp Ser Gly Ser Arg Thr Pro Pro Asp Glu Leu Ala Leu Ser Glu
2375 2380 2385

Thr Gly Ser Ile Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly
2390 2395 2400

Asp Pro Asp Leu Glu Pro Glu Gln Val Glu Leu Gln Pro Pro Pro
2405 2410 2415

Gln Gly Gly Val Val Thr Pro Gly Ser Gly Ser Gly Ser Trp Ser
2420 2425 2430

Thr Cys Ser Glu Glu Asp Asp Ser Val Val Cys Cys Ser Met Ser
2435 2440 2445

Tyr Ser Trp Thr Gly Ala Leu Ile Thr Pro Cys Ser Pro Glu Glu
2450 2455 2460

Glu Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Leu Arg Tyr
2465 2470 2475

His Asn Lys Val Tyr Cys Thr Thr Ser Lys Ser Ala Ser Leu Arg
2480 2485 2490

Ala Lys Lys Val Thr Phe Asp Arg Met Gln Ala Leu Asp Ala His
2495 2500 2505

Tyr Asp Ser Val Leu Lys Asp Ile Lys Leu Ala Ala Ser Lys Val
2510 2515 2520

Thr Ala Arg Leu Leu Thr Leu Glu Glu Ala Cys Gln Leu Thr Pro
2525 2530 2535

Pro His Ser Ala Arg Ser Lys Tyr Gly Phe Gly Ala Lys Glu Val
2540 2545 2550

Arg Ser Leu Ser Gly Arg Ala Val Asn His Ile Lys Ser Val Trp
2555 2560 2565

Lys Asp Leu Leu Glu Asp Thr Gln Thr Pro Ile Pro Thr Thr Ile
2570 2575 2580

Met Ala Lys Asn Glu Val Phe Cys Val Asp Pro Thr Lys Gly Gly
2585 2590 2595

Lys Lys Ala Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg
2600 2605 2610

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Thr Gln Lys Leu Pro
2615 2620 2625

Gln Ala Val Met Gly Ala Ser Tyr Gly Phe Gln Tyr Ser Pro Ala
2630 2635 2640

Gln Arg Val Glu Phe Leu Leu Lys Ala Trp Ala Glu Lys Lys Asp
2645 2650 2655

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
2660 2665 2670

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Arg Ala Cys
2675 2680 2685

Ser Leu Pro Glu Glu Ala His Thr Ala Ile His Ser Leu Thr Glu
2690 2695 2700

Arg Leu Tyr Val Gly Gly Pro Met Phe Asn Ser Lys Gly Gln Thr
2705 2710 2715

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser
2720 2725 2730

Met Gly Asn Thr Ile Thr Cys Tyr Val Lys Ala Leu Ala Ala Cys
2735 2740 2745

Lys Ala Ala Gly Ile Ile Ala Pro Thr Met Leu Val Cys Gly Asp
2750 2755 2760

Asp Leu Val Val Ile Ser Glu Ser Gln Gly Thr Glu Glu Asp Glu
2765 2770 2775

Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala
2780 2785 2790

Pro Pro Gly Asp Pro Pro Arg Pro Glu Tyr Asp Leu Glu Leu Ile
2795 2800 2805

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Gly Pro Gln Gly
2810 2815 2820

Arg Arg Arg Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Ile Ala
2825 2830 2835

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp
2840 2845 2850

Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Ala Arg Met
2855 2860 2865

Val Leu Met Thr His Phe Phe Ser Ile Leu Met Ala Gln Asp Thr
2870 2875 2880

Leu Asp Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser
2885 2890 2895

Val Ser Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly
2900 2905 2910

Leu Asp Ala Phe Ser Leu His Thr Tyr Thr Pro His Glu Leu Thr
2915 2920 2925

Arg Val Ala Ser Ala Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg
2930 2935 2940

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ser
2945 2950 2955

Arg Gly Gly Arg Ala Ala Val Cys Gly Arg Tyr Leu Phe Asn Trp

2960 2965 2970 
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Ala Val Lys Thr Lys Leu 
2975 

Leu Leu Asp Leu Ser Ser 
2990 

Asp lie Tyr His Ser Val 
3005 

Leu Gly Leu Leu Leu Leu 
3020 

Pro Ala Arg 
3033 



Lys Leu Thr Pro Leu Pro Glu Ala Arg 

2980 2985 

Trp Phe Thr Val Gly Ala Gly Gly Gly 

2995 3000 

Ser Arg Ala Arg Pro Arg Leu Leu Leu 

3010 3015 

Phe Val Gly Val Gly Leu Phe Leu Leu 

3025 3030 
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Sequence ID No.fc 

Sequence Length: 9,5 1 1 

Sequence Type: nucleic acid 

Strandedness: single 

Topology: linear 

Molecule Type: genomic RNA 

Method for Determination of Feature: E 



GCCCGCCCCC [illegible] [illegible] [illegible] [illegible] [illegible] 60
UCUUCACGCA GAAAGCGUCU [illegible] [illegible] [illegible] [illegible] 120
CCCCCCUCCC GGGAGAGCCA UAGUGGUCUG CGGAACCGGU GAGUACACCG GAAUUACCGG 180
AAAGACUGGG UCCUUUCUUG GAUAAACCCA CUCUAUGUCC GGUCAUUUGG GCACGCCCCC 240
GCAAGACUGC UAGCCGAGUA GCGUUGGGUU GCGAAAGGCC UUGUGGUACU GCCUGAUAGG 300
GURCUUGCGA GUGCCCCGGG AGGUCUCGUA GACCGUGCAU CAUGAGCACA AAUCCUAAAC 360
CUCAAAGAAA AACCAAAAGA AACACAAACC GCCGCCCACA GGACGUUAAG UUCCCGGGUG 420
GCGGUCAGAU CGUUGGCGGA GUUUACUUGC UGCCGCGCAG GGGCCCCAGG UUGGGUGUGC 480
GCGCGACAAG GAAGACUUCY GAGCGAUCCC AGCCGCGUGG ACGACGCCAG CCCAUCCCGA 540
AAGAUCGGCG CUCCACCGGC AAGUCCUGGG GAAAGCCAGG AUAUCCUUGG CCCCUGUACG 600
GAAACGAGGG UUGCGGCUGG GCGGGUUGGC UCCUGUCCCC CCGCGGGUCU CGUCCUACUU 660
GGGGCCCCAC CGACCCCCGG CAUAGAUCAC GCAAUUUGGG CAGAGUCAUC GAUACCAUUA 720
CGUGUGGUUU UGCCGACCUC AUGGGGUACA UCCCUGUCGU UGGCGCCCCG GUYGGAGGCG 780
UCGCCAGAGC UCUGGCACAC GGUGUUAGGG UCCUGGAGGA CGGGAUAAAU UACGCAACAG 840
GGAAUUUACC CGGUUGCUCU UUUUCUAUCU UUUUGCUUGC UCUUCUGUCA UGCGUCACAR 900
UGCCAGUGUC UGCAGUGGAA GUCAGGAACA UYAGUUCUAG CUACUACGCC ACUAAUGAUU 960
GCUCAAACAA CAGCAUCACC UGGCAGCUCA CUGACGCAGU UCUCCAUCUU CCUGGAUGCG 1020
UCCCAUGUGA GAAYGAUAAY GGCACCUUGC RUUGCUGGAU ACAAGUAACA CCCRACGUGG 1080
CUGUGAAACA CCGCGGUGCG CUCACUCGUA GCCUGCGAAC ACACGUCGAC AUGAUCGUAA 1140
UGGCAGCUAC GGCCUGCUCG GCCUUGUAUG UGGGAGAUGU GUGCGGGGCC GUGAUGAUYC 1200
UAUCGCAGGC UUUCAUGGUA UCACCACAAC GCCACAACUU CACCCAAGAG UGCAACUGUU 1260 
CCAUCUACCA AGGUCACAUC ACCGGCCAUC GCAUGGCAUG GGACAUGAUG CURARCUGGU 1320 
CUCCAACUCU URCCAUGAUC CUCGCCUACG CYGCUCGYGU UCCCGARCUG GUCCUCGAAA 1380
UYAUYUUCGG CGGCCAUUGG GGUGUGGYGU UYGGCUUGGS CUAUUUCUCC AUGCARGGAG 1440 
CGUGGGCCAA AGUCRUYGCC AUCCUCCUUC UUGUUGCGGG AGUGGAUGCA WCCACCUAUU 1500 
CCASCGGYCA GSAAGCGGGU CGURCCGYCK HKGGGWUCKC URGCCUCUUU AHUACUGGUG 1560 
CCAAGCAGAA CCUCYAUUUR AUCAACACCA AUGGCAGCUG GCACAUAAAC CGGACUGCCC 1620 
UCAAUUGCAA UGACAGCYUA SAGACGGGUU UCMUCGCUUC CYUGKUUUAC WHCCRCARGU 1680
UCAACAGCUC UGGCUGCCCC GAGCGCUUGU CUUCCUGCCG CGGGCUGGAC GAYUUYCGCA 1740 
UCGGCUGGGG AACCUUGGAA UACGAAACCA ACGUCACCAA CGAUGRGGAC AUGAGGCCGU 1800 
ACUGCUGGCA UUACCCCCCG AGGCCUUGCG GCAUCGUCCC GGCUAGGACG GUUUGCGGAC 1860 
CGGUCUAUUG YUUCACCCCU AGCCCUGUUG UCGUGGGCAC CACUGACAAG CAGGGCGUAC 1920 
CCACCUACAC CUGGGGRGAA AACGAGACCG AUGUCUUCCU GCRAAAUAGC ACAAGACCCC 1980 
CGCGAGGAGC UUGGUUCGGC UGCACYUGGA UGAACGGGAC UGGGUUCACU AAGACAUGCG 2040 
GUGCACCACC UUGCCGCAUU AGGAAAGACU ACAACAGCAC UCUCGAUUUA UUGUGCCCCA 2100
CAGACUGUUU UAGGAAGCAC CCAGAUGCUA CCUAUCUUAA GUGUGGAGCA GGGCCUUGGU 2160 
UAACUCCCAG GUGCCUGGUA GACUACCCUU AUAGRYUGUG GCAUUAUCCG UGCACUGUAA 2220 
ACUUCACCAU CUUYAAGGCG CGGAUGUAUG UAGGAGGGGU GGAGCAUCGA UUCUCCGCAG 2280 
CAUGCAACUU CACGCGCGGA GAUCGCUGCA GACUGGAAGA UAGGGAUAGG GGYCAGCAGA 2340 
GUCCACUGCU GCAUUCCACU ACUGAGUGGG CGGUGYUCCC AUGCUCCUUC UCUGACCUAC 2400
CAGCACUAUC CACUGGCCUA UUGCACCUCC ACCAAAACAU CGUGGACGUG CAGUACCUYU 2460
ACGGACUUUC UCCGGCUCUG ACAAGAUACA UCGUGAAGUG GGAGUGGGUG AUCCUCCUUU 2520 
UCUUGUUGUU GGCAGACGCC AGGRUCUGUG CAUGCCUUUG GAUGCUCAWC AUACUGGGCC 2580 
AAGCCGAAGC GGCGCUUGAG AAGCUCAUCA UCUUGCACUC CGCUAGYGCU GCUAGUGCCA 2640 
AUGGUCCGCU GUGGUUUUUC AUCUUCUUUA CAGCGGCCUG GUACUUAAAG GGCAGGGUGG 2700
UCCCCGUGGC CACGUACUCU GUBCUCGGCU URUGGUCCUU CCUCCUCCUA GUCCUGGCYU 2760 
UACCACAGCA GGCUUAUGCC UUGGACGCUG CUGAACAAGG GGAACUGGGG CUGGCCAUAU 2820
UAGUAAUUAU AUCCAUCUUU ACUCUUACCC CAGCAUACAA GAUCCUCCUG AGCCGUUCAG 2880
UGUGGUGGCU GUCCUACAUG CUGGUCUUGG CCGAGGCCCA GAUUCAGCAA UGGGUUCCCC 2940
CCCUGGAGGU CCGAGGGGGG CGUGACGGGA UCAUCUGGGU GGCUGUCAUU CUACACCCAC 3000
GCCUUGUGUU UGAGGUCACG AAAUGGUUGU UAGCAAUCCU GGGGCCUGCC UACCUCCUUA 3060
RAGCGUCUCU GCUACGGAUA CCGUACUUUG UGAGGGCCCA CGCUUUGCUA CGAGUGUGUA 3120
CCCUGGUGAA ACACCUCGCR GGGGCUAGGU ACAUCCAGAU GCUGUURAUC ACCAUAGGCA 3180 
GAUGGACCGG CACUUACAUC UACGACCACC UCUCCCCUUU AUCAACUUGG GCGGCCCAGG 3240 
GUUURCGGGA CCUGGCAAUC GCCGUGGAGC CUGUGGUGUU CAGCCCAAUG GAGAAGAAGG 3300 
UCAUUGUGUG GGGGGCUGAG ACAGUGGCGU GUGGAGACAU CCUGCAUGGC CUCCCGGUCU 3360 
CCGCGAGGCU AGGUAGGGAR GUUCUGCUCG GCCCUGCCGA CGGCUACACC UCCAAGGGGU 3420 
GGAAKCUCCU AGCUCCCAUU ACUGCUUACA CUCAGCAAAC UCGUGGUCUC CUGGGUGCUA 3480 
UCGUGGUCAG CCUAACGGGC CGCGACAAAA AUGAGCAGGC UGGGCAGGUC CAGGUUCUGU 3540 
CCUCCGUCAC ACAAACUUUC UUGGGGACAU CCAUUUCGGG CGUCCUCUGG ACAGUAUAUC 3600 
ACGGGGCUGG UAAUAAGACC UUGGCCGGCC CCAAGGGACC AGUCACUCAG AUGUACACCA 3660 
GCGCAGAAGG GGACCUCGUG GGAUGGCCUA GUCCCCCCGG GACUAAGUCA UUGGACCCCU 3720 
GUACCUGCGG GGCCGUAGAC CUCUACCUGG UCACCCGAAA CGCUGAUGUC AUUCCGGUCC 3780 
GGAGGAAAGA UGACCGACGG GGUGCAUUAC UCUCGCCAAG GCCCCUCUCA ACCCUCAAAG 3840 
GAUCAUCCGG AGGGCCCGUG CUCUGCUCWA GGGGACACGC CGUGGGCUUG UUCAGAGCGG 3900
CCGUGUGUGC CAGGGGUGUA GCCAAAUCUA UUGACUUCAU CCCCGUCGAA UCACUCGAUR 3960 
UCGCCACACG GACGCCCAGU UUCUCUGACA ACAGURCGCC GCCAGCUGUG CCCCAGUCUU 4020
ACCAGGUGGG UUACUUGCAC GCACCAACAG GCAGCGGAAA GAGCACCAAG GUCCCUGCCG 4080
CGUAUGCCAG UCAGGGGUAU AAAGUACUCG UACUAAAUCC CUCUGUCGCG GCCACACUUG 4140
GUUUUGGGGC CUACAUGUCC AAAGCCCACG GGAUCAACCC UAAUAUCAGA ACUGGAGUGC 4200
GGACCGUUAC CACCGGGGAC UCUAUCACUU ACUCCACUUA UGGCAAGUUU AUCGCAGAUG 4260 
GAGGCUGUGC AGCCGGUGCC UAUGACAUCA UCAUAUGCGA CGAAUGCCAU UCAGUGGACG 4320 
CUACUACCAU CCUUGGCAUU GGAACAGUCC UUGACCAAGC UGAGACCGCA GGCGUCAGGC 4380 
UAGUGGUYUU GGCCACAGCC ACGCCUCCCG GUACGGUGAC AACUCCCCAC AGUAACAUAG 4440 
AGGAGGUGGC CCUUGGUCAC GAGGGCGAGA UCCCUUUUUA UGGCAAAGCU AUUCCCCUAG 4500 
CUUUCAUCAA GGGGGGCAGA CACUUGAUCU UUUGCCAUUC AAAGAAGAAG UGCGACGAGC 4560 
UCGCAGCGGC CCUCCGGGGC AYGGGUGUCA AUGCCGUUGC AUACUAUAGG GGUCUCGACG 4620
UCUCCGUUAU ACCAACUCAA GGAGACGUGG UGGUUGUCGC CACUGAUGCC CUAAUGACUG 4680
GGUACACCGG CGACUUUGAC UCYGUCAUCG ACUGUAAUGU UGCAGUCUCU CAGAUUGUUG 4740
ACUUCAGCCU AGACCCAACC UUCACCAUCA CCACUCAAAC CGUCCCUCAG GACGCUGUCU 4800 
CCCGUAGUCA ACGUAGAGGG AGAACUGGGA GGGGGCGAUU GGGCRUUUAC AGGUAUGUUU 4860 
CGUCAGGYGA RRGGCCGUCU GGGAUGUUCG ACAGCGUAGU GCYCUGCGAG UGCUAUGAUG 4920 
CCGGGGCAGC CUGGUACGAG CUUACACCUG CUGAGACUAC GGUGAGACUC CGGGCYUAUU 4980 
UCAACACGCC CGGUUUGCCC GUAUGUCAAG ACCACCUGGA GUUCUGGGAA GCGGUCUUUA 5040
CAGGUCUCAC WCACAUURAC GCCCACUUCC UCUCCCAGAC GAAGCAAGGA GGAGAAAACU 5100 
UUGCRUAUCU AACGGCCUAC CAGGCCACAG UAUGCGCCAG GGCAAAGGCC CCUCCUCCUU 5160 
CGUGGGACGU GAUGUGGAAG UGUCUAACUA GGCUCAAACC UACACUGACU GGUCCCACCC 5220
CCCUCCUGUA CCGCUUGGGU GCCGUGACCA AUGAGGUYAC CUUGACGCAC CCCGUGACGA 5280 
AAUACAUCGC CACGUGCAUG CAAGCUGACC UYGAGAUCAU GACAAGCUCA UGGGUCCUGG 5340 
CGGGGGGGGU GCUAGCCGCC GUGGCAGCUU ACUGCCUGGC GACUGGCUGC AUUUCCAUCA 5400 
UUGGCCGCCU ACACCUGAAU GAUCGGGUGG UUGUGRCCCC YGACAAGGAR AUCUUAUAUG 5460
AGGCCUUUGA UGAGAUGGAA GAAUGCGCCU CCAAAGCCGC CCUCAUUGAG GAAGGGCAGC 5520 
GGAUGGCGGA GAUGCUCAAA UCUAAGAUAC AAGGCCUCCU ACAACAGGCC ACAAGGCAAG 5580 
CUCAAGRCAU RCAGCCAGCU AUACAGUCAU CAUGGCCCAA GCUUGAACAA UUUUGGGCCA 5640 
AACACAUGUG GAACUUCAUC AGUGGUAUAC AGUACCUAGC AGGACUCUCC ACCCUACCGG 5700
GAAAUCCUGC AGURGCAUCA AUGAUGGCUU UUAGCGCCGC GCUGACUAGC CCACUACCCA 5760
CCAGCACCAC CAUCCUCUUG AACAUCAUGG GAGGAUGCUU GGCCUCYCAG AUUGCCCCCC 5820 
CUGCCGGAGC CACYGGCUUC GUUGUCAGUG GUCUAGUGGG GGCGGCCGUC GGAAGCAUAG 5880 
GCCUGGGUAA GAUACUGGUG GACGUUUUGG CCGGGUACGG CGCAGGCAUU UCAGGGGCCC 5940
UCGUAGCUUU UAAGAUCAUG AGCGGCGAGA AGCCCACGGU AGAAGACGUU GUGAAUCUCC 6000
UGCCUGCUAU YCUGUCUCCU GGUGCGYUGG UAGUGGGAGU CAUCUGUGCA GCAAUYCUGC 6060 
GCCGCCACGU CGGUCAGGGA GAGGGRGCGG UCCAGUGGAU GAACAGACUG AUCGCCUUCG 6120
CCUCCAGGGG AAACCACGUU GCCCCUACCC ACUACGUGGU GGAGUCUGAC GCUUCACAGC 6180
GUGURACGCA GGUGCUGAGU UCACUUACAA UUACCAGCUU ACUUAGGAGA CUACAUGCCU 6240
GGAUCACUGA AGAUUGCCCA RUCCCAUGCU CGGGGUCUUG GCUCCAGGAC AUUUGGGAUU 6300 
GGGUUUGUUC CAUCCUCACA GACUUYAAAA ACUGGCUGUC UUCAAAAUUA CUCCCCAAGA 6360
UGCCCGGCAU UCCCUUUAUC UCUUGCCAGA AGGGAUACAA GGGUGUAUGG GCUGGUACGG 6420
GUGUCAUGAC YACUCGRURC CCAUGUGGAG CAAACAUCUC GGGCCAUGUC CGCAUGGGCA 6480 
CCAUGAAAAU AACAGGCCCG AAGACUUGCU UGAACCUGUG GCAGGGGACU UUCCCCAUUA 6540 
AUUGUUACAC AGAAGGGCCY UGCGUGCCAA AACCCCCUCC UAAUUACAAG ACCGCAAUUU 6600
GGAGGGUGGC AGCGUCGGAG UACGUUGAGG UCACACAGCA UGGCUCUUUC UCGUAUGUAA 6660 
CRGGGUUAAC CAGUGACAAC CUUAAGGUYC CUUGCCAGGU ACCAGCUCCA GAAUUUUUCU 6720 
CUUGGGUGGA CGGGGUGCAA AUCCACCGAU UCGCCCCCGU WCCAGGUCCC UUCUUUCGGG 6780 
AUGAGGUAAC GUUCACCGUA GGCCUUAACU CCUUCGUGGU CGGCUCUCAG CUCCCUUGCG 6840 
AUCCUGAGCC GGACACCGAR GUACUGGCCU CYAUGUUGAC AGACCCGUCC CACAUCACCG 6900 
CKGAGGCGGC AGCCAGGCGA UUGGCAAGGG GAUCUCCCCC YUCACAGGCU AGCUCCUCAG 6960 
CGAGCCAGCU CUCUGCCCCG UCCUUGAAGG CUACCUGUAC CACCCAUAAG ACAGCAUAUG 7020 
AUUGUGACAU GGUGGAUGCY AACCUUUUCA UGGGAGGHGA UGUGAYCCGG AUUGAGUCUG 7080 
ACUCUAAGGU GAUCGUUCUA GACUCCCUCG AUUCCAUGAC UGAGGUAGAG GAUGAUCGUG 7140 
AGCCUUCUGU ACCAUCAGAG UACCUGAUCA AGAGGAGAAA GUUCCCACCG GCGCUGCCUC 7200 
CUUGGGCCCG UCCAGACUAC AAUCCUGUUU UGAUCGAGAC AUGGAAGAGG CCGGGCUAUG 7260 
AACCACCCAC UGUCCUAGGC UGUGCCCUCC CCCCCACACY UCAAACGCCA GUGCCUCCAC 7320 
CUCGGAGGCG CCGCGCYAAA RUCCUGACCC AGGACRAUGU GGAGGGGRUC CUCAGGGAGA 7380 
UGGCUGACAA AGURCUCAGC CCUCUCCAAG ACAACAAUGA CUCCGGUCAC UCCACUGGAG 7440 
CGGAUACCGG AGGAGACAUC GUCCAGCAAC CCUCUGACGA GACUGCCGCU UCAGAAGCGG 7500 
GGUCACUGUC CUCCAUGCCU CCCCUUGAGG GAGAGCCGGG AGACCCYGAC CUGGAGUUUG 7560 
AACCAGUGGG AUCCGCUCCC CCUUCUGAGG GGGAGUGUGA GGUCAUUGAU UCGGACUCUA 7620 
AGUCGUGGUC CACAGUCUCU GAUCAAGAGG AUUCUGUUAU CUGCUGCUCU AUGUCAUACU 7680 
CCUGGACGGG GGCCCUCAUA ACACCAUGUG GGCCCGAAGA GGAGAAGUUA CCGAUCAACC 7740 
CUCUGAGUAA UUCGCUCAUG CGGUUCCAUA AYAAGGUGUA CUCCACAACC UCGAGGAGUG 7800
CCUCUCUGAG GGCAAAGAAG GUGACUUUUG ACAGGGUGCA GGUGCUGGAC GCACACUAUG 7860 
ACUCAGUCUU GCAGGACGUU AAGCGGGCCG CCUCUAAGGU URGUGCGAGG CUCCUCACAG 7920 
UAGAGGAAGC CUGCGCGCUG ACCCCGCCCC ACUCCGCCAA AUCGCGAUAC GGAUUUGGGG 7980 
CAAAAGAGGU GCGCAGCUUA UCCAGGAGGG CCGUUAACCA CAUCCGGUCC GUGUGGGAGG 8040 
ACCUCCUGGA AGACCAACRU ACCCCAAUUG ACACAACUAU CAUGGCUAAA AAUGAGGUGU 8100
UCUGCAUUGA UCCAACUAAR GGUGGGAAAA AGCCAGCUCG CCUCAUCGUA UACCCCGACC 8160
UUGGGGUCAG GGUGUGCGAA AAGAUGGCCC UCUAUGACAU CRCACAAAAG CUUCCCAAAG 8220
CGAUAAUGGG GCCAUCCUAU GGGUUCCAAU ACUCUCCCGC AGAACGGGUC GAUUUCCUCC 8280
UCAAAGCUUG GGGAAGUAAG AAGGACCCAA UGGGGUUCUC GUAUGACACC CGCUGCUUUG 8340
ACUCAACCGU CACGGAGAGG GACAUAAGAA CAGAAGAAUC CAUAUAUCAG GCUUGUUCUC 8400
UGCCUCAAGA AGCCAGAACU GUCAUACACU CGCUCACUGA GAGACUUUAC GUAGGAGGGC 8460
CCAUGACAAA CAGCAAAGGG CAAUCCUGCG GCUACAGGCG UUGCCGCGCA AGCGGKGUUU 8520
UCACCACCAG CAUGGGGAAU ACCAUGACAU GUUACAUCAA AGCCCUUGCA GCGUGUAAGG 8580
CUGCRGGGAU CGUGGACCCU GUUAUGUUGG UGUGUGGAGA CGACCUGGUC GUCAUCUCAG 8640
AGAGCCAAGG UAACGAGGAG GACGAGCGAA ACCUGAGAGC UUUCACGGAG GCUAUGACCA 8700
GGUAUUCCGC CCCUCCCGGU GACCUUCCCA GACCGGAAUA UGACUUGGAG CUUAUAACAU 8760
CCUGCUCCUC AAACGUAUCG GUAGCGCUGG ACUCUCGGGG UCGCCGCCGG UACUUCCUAA 8820
CCAGAGACCC UACCACUCCA AUCACCCGAG CUGCUUGGGA AACAGUAAGA CACUCCCCUG 8880
UCAAUUCUUG GCUGGGCAAC AUCAUCCAGU ACGCCCCCAC AAUCUGGGUC CGGAUGGUCA 8940
UAAUGACUCA CUUCUUCUCC AUACUAUUGG CCCAGGACAC UCUGAACCAA AAUCUCAAUU 9000
UUGAGAUGUA CGGGGCAGUA UACUCGGUCA AUCCAUUAGA CCUACCGGCC AUAAUUGAAA 9060
GGCUACAUGG GCUUGAAGCC UUUUCACUGC ACACAUACUC UCCCCACGAA CUCUCACGGG 9120
UGGCAGCAAC UCUCAGAAAA CUUGGAGCGC CUCCCCUUAG AGCGUGGAAG AGUCGGGCGC 9180
GUGCCGUGAG AGCUUCACUC AUCGCCCAAG GAGCGAGGGC GGCCAUUUGU GGCCGCUACC 9240
UCUUCAACUG GGCGGUGAAA ACAAAGCUCA AACUCACUCC AUUGCCCGAG GCGAGCCGCC 9300
UGGAUUUAUC CGGGUGGUUC ACCGUGGGCG CCGGCGGGGG CGACAUUUAU CACAGCGUGU 9360
CGCAUGCYCG ACCCCGCCUA UUACUCCUUU GCCUACUCCU ACUUAGCGUA GGAGUAGGCA 9420
UCUUUUUACU CCCCGCUCGG UAGAGCGGCA AACYCUAGCU ACACUCCAUA GCUAGUUUCC 9480
GUUUUUUUUU UUUUUUUUUU UUUUUUUUUU U 9511

Sequence ID No. 7
Sequence Length: 9,511
Sequence Type: nucleic acid 
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



GCCCGCCCCC TGATGGGGGC GACACTCCGC CATGAATCAC TCCCCTGTGA GGAACTACTG 60
TCTTCACGCA GAAAGCGTCT AGCCATGGCG TTAGTATGAG TGTCGTACAG CCTCCAGGCC 120
CCCCCCTCCC GGGAGAGCCA TAGTGGTCTG CGGAACCGGT GAGTACACCG GAATTACCGG 180
AAAGACTGGG TCCTTTCTTG GATAAACCCA CTCTATGTCC GGTCATTTGG GCACGCCCCC 240
GCAAGACTGC TAGCCGAGTA GCGTTGGGTT GCGAAAGGCC TTGTGGTACT GCCTGATAGG 300
GTRCTTGCGA GTGCCCCGGG AGGTCTCGTA GACCGTGCAT CATGAGCACA AATCCTAAAC 360
CTCAAAGAAA AACCAAAAGA AACACAAACC GCCGCCCACA GGACGTTAAG TTCCCGGGTG 420
GCGGTCAGAT CGTTGGCGGA GTTTACTTGC TGCCGCGCAG GGGCCCCAGG TTGGGTGTGC 480
GCGCGACAAG GAAGACTTCY GAGCGATCCC AGCCGCGTGG ACGACGCCAG CCCATCCCGA 540
AAGATCGGCG CTCCACCGGC AAGTCCTGGG GAAAGCCAGG ATATCCTTGG CCCCTGTACG 600
GAAACGAGGG TTGCGGCTGG GCGGGTTGGC TCCTGTCCCC CCGCGGGTCT CGTCCTACTT 660
GGGGCCCCAC CGACCCCCGG CATAGATCAC GCAATTTGGG CAGAGTCATC GATACCATTA 720
CGTGTGGTTT TGCCGACCTC ATGGGGTACA TCCCTGTCGT TGGCGCCCCG GTYGGAGGCG 780
TCGCCAGAGC TCTGGCACAC GGTGTTAGGG TCCTGGAGGA CGGGATAAAT TACGCAACAG 840
GGAATTTACC CGGTTGCTCT TTTTCTATCT TTTTGCTTGC TCTTCTGTCA TGCGTCACAR 900
TGCCAGTGTC TGCAGTGGAA GTCAGGAACA TYAGTTCTAG CTACTACGCC ACTAATGATT 960
GCTCAAACAA CAGCATCACC TGGCAGCTCA CTGACGCAGT TCTCCATCTT CCTGGATGCG 1020
TCCCATGTGA GAAYGATAAY GGCACCTTGC RTTGCTGGAT ACAAGTAACA CCCRACGTGG 1080
CTGTGAAACA CCGCGGTGCG CTCACTCGTA GCCTGCGAAC ACACGTCGAC ATGATCGTAA 1140
TGGCAGCTAC GGCCTGCTCG GCCTTGTATG TGGGAGATGT GTGCGGGGCC GTGATGATYC 1200
TATCGCAGGC TTTCATGGTA TCACCACAAC GCCACAACTT CACCCAAGAG TGCAACTGTT 1260
CCATCTACCA AGGTCACATC ACCGGCCATC GCATGGCATG GGACATGATG CTRARCTGGT 1320
CTCCAACTCT TRCCATGATC CTCGCCTACG CYGCTCGYGT TCCCGARCTG GTCCTCGAAA 1380 
TYATYTTCGG CGGCCATTGG GGTGTGGYGT TYGGCTTGGS CTATTTCTCC ATGCARGGAG 1440 
CGTGGGCCAA AGTCRTYGCC ATCCTCCTTC TTGTTGCGGG AGTGGATGCA WCCACCTATT 1500 
CCASCGGYCA GSAAGCGGGT CGTRCCGYCK HKGGGWTCKC TRGCCTCTTT AHTACTGGTG 1560 
CCAAGCAGAA CCTCYATTTR ATCAACACCA ATGGCAGCTG GCACATAAAC CGGACTGCCC 1620 
TCAATTGCAA TGACAGCYTA SAGACGGGTT TCHTCGCTTC CYTGKTTTAC WHCCRCARGT 1680 
TCAACAGCTC TGGCTGCCCC GAGCGCTTGT CTTCCTGCCG CGGGCTGGAC GAYTTYCGCA 1740 
TCGGCTGGGG AACCTTGGAA TACGAAACCA ACGTCACCAA CGATGRGGAC ATGAGGCCGT 1800 
ACTGCTGGCA TTACCCCCCG AGGCCTTGCG GCATCGTCCC GGCTAGGACG GTTTGCGGAC 1860 
CGGTCTATTG YTTCACCCCT AGCCCTGTTG TCGTGGGCAC CACTGACAAG CAGGGCGTAC 1920 
CCACCTACAC CTGGGGRGAA AACGAGACCG ATGTCTTCCT GCTRAATAGC ACAAGACCCC 1980 
CGCGAGGAGC TTGGTTCGGC TGCACYTGGA TGAACGGGAC TGGGTTCACT AAGACATGCG 2040 
GTGCACCACC TTGCCGCATT AGGAAAGACT ACAACAGCAC TCTCGATTTA TTGTGCCCCA 2100 
CAGACTGTTT TAGGAAGCAC CCAGATGCTA CCTATCTTAA GTGTGGAGCA GGGCCTTGGT 2160 
TAACTCCCAG GTGCCTGGTA GACTACCCTT ATAGRYTGTG GCATTATCCG TGCACTGTAA 2220 
ACTTCACCAT CTTYAAGGCG CGGATGTATG TAGGAGGGGT GGAGCATCGA TTCTCCGCAG 2280 
CATGCAACTT CACGCGCGGA GATCGCTGCA GACTGGAAGA TAGGGATAGG GGYCAGCAGA 2340 
GTCCACTGCT GCATTCCACT ACTGAGTGGG CGGTGYTCCC ATGCTCCTTC TCTGACCTAC 2400 
CAGCACTATC CACTGGCCTA TTGCACCTCC ACCAAAACAT CGTGGACGTG CAGTACCTYT 2460 
ACGGACTTTC TCCGGCTCTG ACAAGATACA TCGTGAAGTG GGAGTGGGTG ATCCTCCTTT 2520 
TCTTGTTGTT GGCAGACGCC AGGRTCTGTG CATGCCTTTG GATGCTCAWC ATACTGGGCC 2580 
AAGCCGAAGC GGCGCTTGAG AAGCTCATCA TCTTGCACTC CGCTAGYGCT GCTAGTGCCA 2640 
ATGGTCCGCT GTGGTTTTTC ATCTTCTTTA CAGCGGCCTG GTACTTAAAG GGCAGGGTGG 2700
TCCCCGTGGC CACGTACTCT GTBCTCGGCT TRTGGTCCTT CCTCCTCCTA GTCCTGGCYT 2760 
TACCACAGCA GGCTTATGCC TTGGACGCTG CTGAACAAGG GGAACTGGGG CTGGCCATAT 2820 
TAGTAATTAT ATCCATCTTT ACTCTTACCC CAGCATACAA GATCCTCCTG AGCCGTTCAG 2880
TGTGGTGGCT GTCCTACATG CTGGTCTTGG CCGAGGCCCA GATTCAGCAA TGGGTTCCCC 2940
CCCTGGAGGT CCGAGGGGGG CGTGACGGGA TCATCTGGGT GGCTGTCATT CTACACCCAC 3000
GCCTTGTGTT TGAGGTCACG AAATGGTTGT TAGCAATCCT GGGGCCTGCC TACCTCCTTA 3060
RAGCGTCTCT GCTACGGATA CCGTACTTTG TGAGGGCCCA CGCTTTGCTA CGAGTGTGTA 3120
CCCTGGTGAA ACACCTCGCR GGGGCTAGGT ACATCCAGAT GCTGTTRATC ACCATAGGCA 3180
GATGGACCGG CACTTACATC TACGACCACC TCTCCCCTTT ATCAACTTGG GCGGCCCAGG 3240
GTTTRCGGGA CCTGGCAATC GCCGTGGAGC CTGTGGTGTT CAGCCCAATG GAGAAGAAGG 3300
TCATTGTGTG GGGGGCTGAG ACAGTGGCGT GTGGAGACAT CCTGCATGGC CTCCCGGTCT 3360
CCGCGAGGCT AGGTAGGGAR GTTCTGCTCG GCCCTGCCGA CGGCTACACC TCCAAGGGGT 3420
GGAAKCTCCT AGCTCCCATT ACTGCTTACA CTCAGCAAAC TCGTGGTCTC CTGGGTGCTA 3480
TCGTGGTCAG CCTAACGGGC CGCGACAAAA ATGAGCAGGC TGGGCAGGTC CAGGTTCTGT 3540
CCTCCGTCAC ACAAACTTTC TTGGGGACAT CCATTTCGGG CGTCCTCTGG ACAGTATATC 3600
ACGGGGCTGG TAATAAGACC TTGGCCGGCC CCAAGGGACC AGTCACTCAG ATGTACACCA 3660
GCGCAGAAGG GGACCTCGTG GGATGGCCTA GTCCCCCCGG GACTAAGTCA TTGGACCCCT 3720
GTACCTGCGG GGCCGTAGAC CTCTACCTGG TCACCCGAAA CGCTGATGTC ATTCCGGTCC 3780
GGAGGAAAGA TGACCGACGG GGTGCATTAC TCTCGCCAAG GCCCCTCTCA ACCCTCAAAG 3840
GATCATCCGG AGGGCCCGTG CTCTGCTCWA GGGGACACGC CGTGGGCTTG TTCAGAGCGG 3900
CCGTGTGTGC CAGGGGTGTA GCCAAATCTA TTGACTTCAT CCCCGTCGAA TCACTCGATR 3960
TCGCCACACG GACGCCCAGT TTCTCTGACA ACAGTRCGCC GCCAGCTGTG CCCCAGTCTT 4020
ACCAGGTGGG TTACTTGCAC GCACCAACAG GCAGCGGAAA GAGCACCAAG GTCCCTGCCG 4080
CGTATGCCAG TCAGGGGTAT AAAGTACTCG TACTAAATCC CTCTGTCGCG GCCACACTTG 4140
GTTTTGGGGC CTACATGTCC AAAGCCCACG GGATCAACCC TAATATCAGA ACTGGAGTGC 4200
GGACCGTTAC CACCGGGGAC TCTATCACTT ACTCCACTTA TGGCAAGTTT ATCGCAGATG 4260
GAGGCTGTGC AGCCGGTGCC TATGACATCA TCATATGCGA CGAATGCCAT TCAGTGGACG 4320
CTACTACCAT CCTTGGCATT GGAACAGTCC TTGACCAAGC TGAGACCGCA GGCGTCAGGC 4380
TAGTGGTYTT GGCCACAGCC ACGCCTCCCG GTACGGTGAC AACTCCCCAC AGTAACATAG 4440
AGGAGGTGGC CCTTGGTCAC GAGGGCGAGA TCCCTTTTTA TGGCAAAGCT ATTCCCCTAG 4500
CTTTCATCAA GGGGGGCAGA CACTTGATCT TTTGCCATTC AAAGAAGAAG TGCGACGAGC 4560
TCGCAGCGGC CCTCCGGGGC AYGGGTGTCA ATGCCGTTGC ATACTATAGG GGTCTCGACG 4620
TCTCCGTTAT ACCAACTCAA GGAGACGTGG TGGTTGTCGC CACTGATGCC CTAATGACTG 4680
GGTACACCGG CGACTTTGAC TCYGTCATCG ACTGTAATGT TGCAGTCTCT CAGATTGTTG 4740
ACTTCAGCCT AGACCCAACC TTCACCATCA CCACTCAAAC CGTCCCTCAG GACGCTGTCT 4800
CCCGTAGTCA ACGTAGAGGG AGAACTGGGA GGGGGCGATT GGGCRTTTAC AGGTATGTTT 4860
CGTCAGGYGA RRGGCCGTCT GGGATGTTCG ACAGCGTAGT GCYCTGCGAG TGCTATGATG 4920
CCGGGGCAGC CTGGTACGAG CTTACACCTG CTGAGACTAC GGTGAGACTC CGGGCYTATT 4980
TCAACACGCC CGGTTTGCCC GTATGTCAAG ACCACCTGGA GTTCTGGGAA GCGGTCTTTA 5040
CAGGTCTCAC WCACATTRAC GCCCACTTCC TCTCCCAGAC GAAGCAAGGA GGAGAAAACT 5100
TTGCRTATCT AACGGCCTAC CAGGCCACAG TATGCGCCAG GGCAAAGGCC CCTCCTCCTT 5160
CGTGGGACGT GATGTGGAAG TGTCTAACTA GGCTCAAACC TACACTGACT GGTCCCACCC 5220
CCCTCCTGTA CCGCTTGGGT GCCGTGACCA ATGAGGTYAC CTTGACGCAC CCCGTGACGA 5280
AATACATCGC CACGTGCATG CAAGCTGACC TYGAGATCAT GACAAGCTCA TGGGTCCTGG 5340
CGGGGGGGGT GCTAGCCGCC GTGGCAGCTT ACTGCCTGGC GACTGGCTGC ATTTCCATCA 5400
TTGGCCGCCT ACACCTGAAT GATCGGGTGG TTGTGRCCCC YGACAAGGAR ATCTTATATG 5460
AGGCCTTTGA TGAGATGGAA GAATGCGCCT CCAAAGCCGC CCTCATTGAG GAAGGGCAGC 5520
GGATGGCGGA GATGCTCAAA TCTAAGATAC AAGGCCTCCT ACAACAGGCC ACAAGGCAAG 5580
CTCAAGRCAT RCAGCCAGCT ATACAGTCAT CATGGCCCAA GCTTGAACAA TTTTGGGCCA 5640
AACACATGTG GAACTTCATC AGTGGTATAC AGTACCTAGC AGGACTCTCC ACCCTACCGG 5700
GAAATCCTGC AGTRGCATCA ATGATGGCTT TTAGCGCCGC GCTGACTAGC CCACTACCCA 5760
CCAGCACCAC CATCCTCTTG AACATCATGG GAGGATGCTT GGCCTCYCAG ATTGCCCCCC 5820
CTGCCGGAGC CACYGGCTTC GTTGTCAGTG GTCTAGTGGG GGCGGCCGTC GGAAGCATAG 5880
GCCTGGGTAA GATACTGGTG GACGTTTTGG CCGGGTACGG CGCAGGCATT TCAGGGGCCC 5940
TCGTAGCTTT TAAGATCATG AGCGGCGAGA AGCCCACGGT AGAAGACGTT GTGAATCTCC 6000
TGCCTGCTAT YCTGTCTCCT GGTGCGYTGG TAGTGGGAGT CATCTGTGCA GCAATYCTGC 6060
GCCGCCACGT CGGTCAGGGA GAGGGRGCGG TCCAGTGGAT GAACAGACTG ATCGCCTTCG 6120
CCTCCAGGGG AAACCACGTT GCCCCTACCC ACTACGTGGT GGAGTCTGAC GCTTCACAGC 6180
GTGTRACGCA GGTGCTGAGT TCACTTACAA TTACCAGCTT ACTTAGGAGA CTACATGCCT 6240
GGATCACTGA AGATTGCCCA RTCCCATGCT CGGGGTCTTG GCTCCAGGAC ATTTGGGATT 6300
GGGTTTGTTC CATCCTCACA GACTTYAAAA ACTGGCTGTC TTCAAAATTA CTCCCCAAGA 6360
TGCCCGGCAT TCCCTTTATC TCTTGCCAGA AGGGATACAA GGGTGTATGG GCTGGTACGG 6420
GTGTCATGAC YACTCGRTRC CCATGTGGAG CAAACATCTC GGGCCATGTC CGCATGGGCA 6480
CCATGAAAAT AACAGGCCCG AAGACTTGCT TGAACCTGTG GCAGGGGACT TTCCCCATTA 6540
ATTGTTACAC AGAAGGGCCY TGCGTGCCAA AACCCCCTCC TAATTACAAG ACCGCAATTT 6600
GGAGGGTGGC AGCGTCGGAG TACGTTGAGG TCACACAGCA TGGCTCTTTC TCGTATGTAA 6660
CRGGGTTAAC CAGTGACAAC CTTAAGGTYC CTTGCCAGGT ACCAGCTCCA GAATTTTTCT 6720
CTTGGGTGGA CGGGGTGCAA ATCCACCGAT TCGCCCCCGT WCCAGGTCCC TTCTTTCGGG 6780
ATGAGGTAAC GTTCACCGTA GGCCTTAACT CCTTCGTGGT CGGCTCTCAG CTCCCTTGCG 6840 
ATCCTGAGCC GGACACCGAR GTACTGGCCT CYATGTTGAC AGACCCGTCC CACATCACCG 6900 
CKGAGGCGGC AGCCAGGCGA TTGGCAAGGG GATCTCCCCC YTCACAGGCT AGCTCCTCAG 6960 
CGAGCCAGCT CTCTGCCCCG TCCTTGAAGG CTACCTGTAC CACCCATAAG ACAGCATATG 7020
ATTGTGACAT GGTGGATGCY AACCTTTTCA TGGGAGGHGA TGTGAYCCGG ATTGAGTCTG 7080
ACTCTAAGGT GATCGTTCTA GACTCCCTCG ATTCCATGAC TGAGGTAGAG GATGATCGTG 7140 
AGCCTTCTGT ACCATCAGAG TACCTGATCA AGAGGAGAAA GTTCCCACCG GCGCTGCCTC 7200
CTTGGGCCCG TCCAGACTAC AATCCTGTTT TGATCGAGAC ATGGAAGAGG CCGGGCTATG 7260
AACCACCCAC TGTCCTAGGC TGTGCCCTCC CCCCCACACY TCAAACGCCA GTGCCTCCAC 7320 
CTCGGAGGCG CCGCGCYAAA RTCCTGACCC AGGACRATGT GGAGGGGRTC CTCAGGGAGA 7380 
TGGCTGACAA AGTRCTCAGC CCTCTCCAAG ACAACAATGA CTCCGGTCAC TCCACTGGAG 7440 
CGGATACCGG AGGAGACATC GTCCAGCAAC CCTCTGACGA GACTGCCGCT TCAGAAGCGG 7500
GGTCACTGTC CTCCATGCCT CCCCTTGAGG GAGAGCCGGG AGACCCYGAC CTGGAGTTTG 7560 
AACCAGTGGG ATCCGCTCCC CCTTCTGAGG GGGAGTGTGA GGTCATTGAT TCGGACTCTA 7620 
AGTCGTGGTC CACAGTCTCT GATCAAGAGG ATTCTGTTAT CTGCTGCTCT ATGTCATACT 7680 
CCTGGACGGG GGCCCTCATA ACACCATGTG GGCCCGAAGA GGAGAAGTTA CCGATCAACC 7740 
CTCTGAGTAA TTCGCTCATG CGGTTCCATA AYAAGGTGTA CTCCACAACC TCGAGGAGTG 7800 
CCTCTCTGAG GGCAAAGAAG GTGACTTTTG ACAGGGTGCA GGTGCTGGAC GCACACTATG 7860 
ACTCAGTCTT GCAGGACGTT AAGCGGGCCG CCTCTAAGGT TRGTGCGAGG CTCCTCACAG 7920
TAGAGGAAGC CTGCGCGCTG ACCCCGCCCC ACTCCGCCAA ATCGCGATAC GGATTTGGGG 7980 
CAAAAGAGGT GCGCAGCTTA TCCAGGAGGG CCGTTAACCA CATCCGGTCC GTGTGGGAGG 8040 
ACCTCCTGGA AGACCAACRT ACCCCAATTG ACACAACTAT CATGGCTAAA AATGAGGTGT 8100 
TCTGCATTGA TCCAACTAAR GGTGGGAAAA AGCCAGCTCG CCTCATCGTA TACCCCGACC 8160
TTGGGGTCAG GGTGTGCGAA AAGATGGCCC TCTATGACAT CRCACAAAAG CTTCCCAAAG 8220
CGATAATGGG GCCATCCTAT GGGTTCCAAT ACTCTCCCGC AGAACGGGTC GATTTCCTCC 8280
TCAAAGCTTG GGGAAGTAAG AAGGACCCAA TGGGGTTCTC GTATGACACC CGCTGCTTTG 8340
ACTCAACCGT CACGGAGAGG GACATAAGAA CAGAAGAATC CATATATCAG GCTTGTTCTC 8400
TGCCTCAAGA AGCCAGAACT GTCATACACT CGCTCACTGA GAGACTTTAC GTAGGAGGGC 8460
CCATGACAAA CAGCAAAGGG CAATCCTGCG GCTACAGGCG TTGCCGCGCA AGCGGKGTTT 8520
TCACCACCAG CATGGGGAAT ACCATGACAT GTTACATCAA AGCCCTTGCA GCGTGTAAGG 8580
CTGCRGGGAT CGTGGACCCT GTTATGTTGG TGTGTGGAGA CGACCTGGTC GTCATCTCAG 8640
AGAGCCAAGG TAACGAGGAG GACGAGCGAA ACCTGAGAGC TTTCACGGAG GCTATGACCA 8700
GGTATTCCGC CCCTCCCGGT GACCTTCCCA GACCGGAATA TGACTTGGAG CTTATAACAT 8760
CCTGCTCCTC AAACGTATCG GTAGCGCTGG ACTCTCGGGG TCGCCGCCGG TACTTCCTAA 8820
CCAGAGACCC TACCACTCCA ATCACCCGAG CTGCTTGGGA AACAGTAAGA CACTCCCCTG 8880
TCAATTCTTG GCTGGGCAAC ATCATCCAGT ACGCCCCCAC AATCTGGGTC CGGATGGTCA 8940
TAATGACTCA CTTCTTCTCC ATACTATTGG CCCAGGACAC TCTGAACCAA AATCTCAATT 9000
TTGAGATGTA CGGGGCAGTA TACTCGGTCA ATCCATTAGA CCTACCGGCC ATAATTGAAA 9060
GGCTACATGG GCTTGAAGCC TTTTCACTGC ACACATACTC TCCCCACGAA CTCTCACGGG 9120
TGGCAGCAAC TCTCAGAAAA CTTGGAGCGC CTCCCCTTAG AGCGTGGAAG AGTCGGGCGC 9180
GTGCCGTGAG AGCTTCACTC ATCGCCCAAG GAGCGAGGGC GGCCATTTGT GGCCGCTACC 9240
TCTTCAACTG GGCGGTGAAA ACAAAGCTCA AACTCACTCC ATTGCCCGAG GCGAGCCGCC 9300
TGGATTTATC CGGGTGGTTC ACCGTGGGCG CCGGCGGGGG CGACATTTAT CACAGCGTGT 9360
CGCATGCYCG ACCCCGCCTA TTACTCCTTT GCCTACTCCT ACTTAGCGTA GGAGTAGGCA 9420
TCTTTTTACT CCCCGCTCGG TAGAGCGGCA AACYCTAGCT ACACTCCATA GCTAGTTTCC 9480
GTTTTTTTTT TTTTTTTTTT TTTTTTTTTT T 9511

Sequence ID No. 8
Sequence Length: 3,033 
Sequence Type: amino acid 
Topology: linear 
Molecule Type: protein 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
5 10 15

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile
20 25 30

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly
35 40 45

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly
50 55 60

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser
65 70 75

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly
80 85 90

Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro
95 100 105

Thr Trp Gly Pro Thr Asp Pro Arg His Arg Ser Arg Asn Leu Gly
110 115 120

Arg Val Ile Asp Thr Ile Thr Cys Gly Phe Ala Asp Leu Met Gly
125 130 135

Tyr Ile Pro Val Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala
140 145 150

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala
155 160 165

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala
170 175 180

Leu Leu Ser Cys Val Thr Val Pro Val Ser Ala Val Glu Val Arg
185 190 195

Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp Cys Ser Asn Asn
200 205 210

Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu His Leu Pro Gly
215 220 225

Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu His Cys Trp Ile
230 235 240

Gln Val Thr Pro Asn Val Ala Val Lys His Arg Gly Ala Leu Thr
245 250 255

Arg Ser Leu Arg Thr His Val Asp Met Ile Val Met Ala Ala Thr
260 265 270

Ala Cys Ser Ala Leu Tyr Val Gly Asp Val Cys Gly Ala Val Met
275 280 285

Ile Leu Ser Gln Ala Phe Met Val Ser Pro Gln Arg His Asn Phe
290 295 300

Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly His Ile Thr Gly
305 310 315

His Arg Met Ala Trp Asp Met Met Leu Ser Trp Ser Pro Thr Leu
320 325 330

Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu Val Leu
335 340 345

Glu Ile Ile Phe Gly Gly His Trp Gly Val Val Phe Gly Leu Ala
350 355 360

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Ile Ala Ile Leu
365 370 375

Leu Leu Val Ala Gly Val Asp Ala Thr Thr Tyr Ser Ser Gly Gln
380 385 390

Glu Ala Gly Arg Thr Val Ala Gly Phe Ala Gly Leu Phe Thr Thr
395 400 405

Gly Ala Lys Gln Asn Leu Tyr Leu Ile Asn Thr Asn Gly Ser Trp
410 415 420

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr
425 430 435

Gly Phe Leu Ala Ser Leu Phe Tyr Thr His Lys Phe Asn Ser Ser
440 445 450

Gly Cys Pro Glu Arg Leu Ser Ser Cys Arg Gly Leu Asp Asp Phe
455 460 465

Arg Ile Gly Trp Gly Thr Leu Glu Tyr Glu Thr Asn Val Thr Asn
470 475 480

Asp Gly Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro
485 490 495

Cys Gly Ile Val Pro Ala Arg Thr Val Cys Gly Pro Val Tyr Cys
500 505 510

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Lys Gln Gly
515 520 525

Val Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu
530 535 540

Leu Asn Ser Thr Arg Pro Pro Arg Gly Ala Trp Phe Gly Cys Thr
545 550 555

Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Ala Pro Pro
560 565 570

Cys Arg Ile Arg Lys Asp Tyr Asn Ser Thr Ile Asp Leu Leu Cys
575 580 585

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Ala Thr Tyr Leu Lys
590 595 600

Cys Gly Ala Gly Pro Trp Leu Thr Pro Arg Cys Leu Val Asp Tyr
605 610 615

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile
620 625 630

Phe Lys Ala Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Ser
635 640 645

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Arg Leu Glu Asp
650 655 660

Arg Asp Arg Gly Gln Gln Ser Pro Leu Leu His Ser Thr Thr Glu
665 670 675

Trp Ala Val Leu Pro Cys Ser Phe Ser Asp Leu Pro Ala Leu Ser
680 685 690

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Tyr
695 700 705

Leu Tyr Gly Leu Ser Pro Ala Leu Thr Arg Tyr Ile Val Lys Trp
710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile
725 730 735

Cys Ala Cys Leu Trp Met Leu Ile Ile Leu Gly Gln Ala Glu Ala
740 745 750

Ala Leu Glu Lys Leu Ile Ile Leu His Ser Ala Ser Ala Ala Ser
755 760 765

Ala Asn Gly Pro Leu Trp Phe Phe Ile Phe Phe Thr Ala Ala Trp
770 775 780

Tyr Leu Lys Gly Arg Val Val Pro Val Ala Thr Tyr Ser Val Leu
785 790 795

Gly Leu Trp Ser Phe Leu Leu Leu Val Leu Ala Leu Pro Gln Gln
800 805 810

Ala Tyr Ala Leu Asp Ala Ala Glu Gln Gly Glu Leu Gly Leu Ala
815 820 825

Ile Leu Val Ile Ile Ser Ile Phe Thr Leu Thr Pro Ala Tyr Lys
830 835 840

Ile Leu Leu Ser Arg Ser Val Trp Trp Leu Ser Tyr Met Leu Val
845 850 855

Leu Ala Glu Ala Gln Ile Gln Gln Trp Val Pro Pro Leu Glu Val
860 865 870

Arg Gly Gly Arg Asp Gly Ile Ile Trp Val Ala Val Ile Leu His
875 880 885

Pro Arg Leu Val Phe Glu Val Thr Lys Trp Leu Leu Ala Ile Leu
890 895 900

Gly Pro Ala Tyr Leu Leu Lys Ala Ser Leu Leu Arg Ile Pro Tyr
905 910 915

Phe Val Arg Ala His Ala Leu Leu Arg Val Cys Thr Leu Val Lys
920 925 930

His Leu Ala Gly Ala Arg Tyr Ile Gln Met Leu Leu Ile Thr Ile
935 940 945

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Ser Pro Leu
950 955 960

Ser Thr Trp Ala Ala Gln Gly Leu Arg Asp Leu Ala Ile Ala Val
965 970 975

Glu Pro Val Val Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp
980 985 990

Gly Ala Glu Thr Val Ala Cys Gly Asp Ile Leu His Gly Leu Pro
995 1000 1005

Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp
1010 1015 1020

Gly Tyr Thr Ser Lys Gly Trp Lys Leu Leu Ala Pro Ile Thr Ala
1025 1030 1035

Tyr Thr Gln Gln Thr Arg Gly Leu Leu Gly Ala Ile Val Val Ser
1040 1045 1050

Leu Thr Gly Arg Asp Lys Asn Glu Gln Ala Gly Gln Val Gln Val
1055 1060 1065

Leu Ser Ser Val Thr Gln Thr Phe Leu Gly Thr Ser Ile Ser Gly
1070 1075 1080

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala
1085 1090 1095

Gly Pro Lys Gly Pro Val Thr Gln Met Tyr Thr Ser Ala Glu Gly
1100 1105 1110

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Asp
1115 1120 1125

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn
1130 1135 1140

Ala Asp Val Ile Pro Val Arg Arg Lys Asp Asp Arg Arg Gly Ala
1145 1150 1155

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly
1160 1165 1170

Gly Pro Val Leu Cys Ser Arg Gly His Ala Val Gly Leu Phe Arg
1175 1180 1185

Ala Ala Val Cys Ala Arg Gly Val Ala Lys Ser Ile Asp Phe Ile
1190 1195 1200

Pro Val Glu Ser Leu Asp Val Ala Thr Arg Thr Pro Ser Phe Ser
1205 1210 1215

Asp Asn Ser Thr Pro Pro Ala Val Pro Gln Ser Tyr Gln Val Gly
1220 1225 1230

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro
1235 1240 1245

Ala Ala Tyr Ala Ser Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1250 1255 1260

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala
1265 1270 1275

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr
1280 1285 1290

Thr Gly Asp Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Ile Ala
1295 1300 1305

Asp Gly Gly Cys Ala Ala Gly Ala Tyr Asp Ile Ile Ile Cys Asp
1310 1315 1320

Glu Cys His Ser Val Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr
1325 1330 1335

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Val Val Leu
1340 1345 1350

Ala Thr Ala Thr Pro Pro Gly Thr Val Thr Thr Pro His Ser Asn
1355 1360 1365

Ile Glu Glu Val Ala Leu Gly His Glu Gly Glu Ile Pro Phe Tyr
1370 1375 1380

Gly Lys Ala Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu
1385 1390 1395

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala
1400 1405 1410

Leu Arg Gly Met Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu
1415 1420 1425

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala
1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val
1445 1450 1455

Ile Asp Cys Asn Val Ala Val Ser Gln Ile Val Asp Phe Ser Leu
1460 1465 1470

Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala
1475 1480 1485

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu
1490 1495 1500

Gly Val Tyr Arg Tyr Val Ser Ser Gly Glu Arg Pro Ser Gly Met
1505 1510 1515

Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala Gly Ala Ala
1520 1525 1530

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala
1535 1540 1545

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu
1550 1555 1560

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asp Ala His
1565 1570 1575

Phe Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Ala Tyr Leu
1580 1585 1590

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro
1595 1600 1605

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro
1610 1615 1620

Thr Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val
1625 1630 1635

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala
1640 1645 1650

Thr Cys Met Gln Ala Asp Leu Glu Ile Met Thr Ser Ser Trp Val
1655 1660 1665

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala
1670 1675 1680

Thr Gly Cys Ile Ser Ile Ile Gly Arg Leu His Leu Asn Asp Arg
1685 1690 1695

Val Vai Val Ala Pro Asd Lys Glu lie Leu Tyr Glu Ala Phe Asp 
1700 1705 1710 

Glu Hec Glu Glu Cys Ala Ser Lys Ala Ala Leu He Glu Glu Gly 
1715 1720 1725 

Gin Arg He- Ala Glu Met Leu Lys Ser Lys lie Gin Gly Leu Leu 
1730 1735 1740 

Gln Gln Ala Thr Arg Gln Ala Gln Asp Ile Gln Pro Ala Ile Gln
1745 1750 1755

Ser Ser Trp Pro Lys Leu Glu Gln Phe Trp Ala Lys His Met Trp
1760 1765 1770

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu
1775 1780 1785

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala
1790 1795 1800

Leu Thr Ser Pro Leu Pro Thr Ser Thr Thr Ile Leu Leu Asn Ile
1805 1810 1815

Met Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala
1820 1825 1830

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser
1835 1840 1845

Ile Gly Leu Gly Lys Ile Leu Val Asp Val Leu Ala Gly Tyr Gly
1850 1855 1860

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly
1865 1870 1875

Glu Lys Pro Thr Val Glu Asp Val Val Asn Leu Leu Pro Ala Ile
1880 1885 1890

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile
1895 1900 1905
Leu Arg Arg His Val Gly Gln Gly Glu Gly Ala Val Gln Trp Met

1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro

1925 1930 1935

Thr His Tyr Val Val Glu Ser Asp Ala Ser Gln Arg Val Thr Gln

1940 1945 1950

Val Leu Ser Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His

1955 1960 1965

Ala Trp Ile Thr Glu Asp Cys Pro Val Pro Cys Ser Gly Ser Trp

1970 1975 1980

Leu Gln Asp Ile Trp Asp Trp Val Cys Ser Ile Leu Thr Asp Phe

1985 1990 1995

Lys Asn Trp Leu Ser Ser Lys Leu Leu Pro Lys Met Pro Gly Ile

2000 2005 2010

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly

2015 2020 2025

Thr Gly Val Met Thr Thr Arg Cys Pro Cys Gly Ala Asn Ile Ser

2030 2035 2040

Gly His Val Arg Met Gly Thr Met Lys Ile Thr Gly Pro Lys Thr

2045 2050 2055

Cys Leu Asn Leu Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr

2060 2065 2070

Glu Gly Pro Cys Val Pro Lys Pro Pro Pro Asn Tyr Lys Thr Ala

2075 2080 2085

Ile Trp Arg Val Ala Ala Ser Glu Tyr Val Glu Val Thr Gln His

2090 2095 2100 

Gly Ser Phe Ser Tyr Val Thr Gly Leu Thr Ser Asp Asn Leu Lys 

2105 2110 2115 

Val Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Ser Trp Val Asp
2120 2125 2130

Gly Val Gln Ile His Arg Phe Ala Pro Val Pro Gly Pro Phe Phe

2135 2140 2145

Arg Asp Glu Val Thr Phe Thr Val Gly Leu Asn Ser Phe Val Val

2150 2155 2160

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Glu Val Leu

2165 2170 2175

Ala Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala

2180 2185 2190

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Gln Ala Ser Ser

2195 2200 2205

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr

2210 2215 2220

Thr His Lys Thr Ala Tyr Asp Cys Asp Met Val Asp Ala Asn Leu

2225 2230 2235

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Asp Ser Lys Val

2240 2245 2250

Ile Val Leu Asp Ser Leu Asp Ser Met Thr Glu Val Glu Asp Asp

2255 2260 2265

Arg Glu Pro Ser Val Pro Ser Glu Tyr Leu Ile Lys Arg Arg Lys

2270 2275 2280

Phe Pro Pro Ala Leu Pro Pro Trp Ala Arg Pro Asp Tyr Asn Pro

2285 2290 2295

Val Leu Ile Glu Thr Trp Lys Arg Pro Gly Tyr Glu Pro Pro Thr

2300 2305 2310

Val Leu Gly Cys Ala Leu Pro Pro Thr Pro Gln Thr Pro Val Pro

2315 2320 2325

Pro Pro Arg Arg Arg Arg Ala Lys Val Leu Thr Gln Asp Asn Val

2330 2335 2340
Glu Gly Val Leu Arg Glu Met Ala Asp Lys Val Leu Ser Pro Leu

2345 2350 2355

Gln Asp Asn Asn Asp Ser Gly His Ser Thr Gly Ala Asp Thr Gly

2360 2365 2370

Gly Asp Ile Val Gln Gln Pro Ser Asp Glu Thr Ala Ala Ser Glu

2375 2380 2385

Ala Gly Ser Leu Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly

2390 2395 2400

Asp Pro Asp Leu Glu Phe Glu Pro Val Gly Ser Ala Pro Pro Ser

2405 2410 2415

Glu Gly Glu Cys Glu Val Ile Asp Ser Asp Ser Lys Ser Trp Ser

2420 2425 2430

Thr Val Ser Asp Gln Glu Asp Ser Val Ile Cys Cys Ser Met Ser

2435 2440 2445

Tyr Ser Trp Thr Gly Ala Leu Ile Thr Pro Cys Gly Pro Glu Glu

2450 2455 2460

Glu Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Met Arg Phe

2465 2470 2475

His Asn Lys Val Tyr Ser Thr Thr Ser Arg Ser Ala Ser Leu Arg

2480 2485 2490

Ala Lys Lys Val Thr Phe Asp Arg Val Gln Val Leu Asp Ala His

2495 2500 2505

Tyr Asp Ser Val Leu Gln Asp Val Lys Arg Ala Ala Ser Lys Val

2510 2515 2520 

Ser Ala Arg Leu Leu Thr Val Glu Glu Ala Cys Ala Leu Thr Pro 

2525 2530 2535 

Pro His Ser Ala Lys Ser Arg Tyr Gly Phe Gly Ala Lys Glu Val 

2540 2545 2550 

Arg Ser Leu Ser Arg Arg Ala Val Asn His Ile Arg Ser Val Trp
2555 2560 2565

Glu Asn Leu Leu Glu Asp Gln His Thr Pro Ile Asp Thr Thr Ile

2570 2575 2580

Met Ala Lys Asn Glu Val Phe Cys Ile Asp Pro Thr Lys Gly Gly

2585 2590 2595

Lys Lys Pro Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg

2600 2605 2610

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Ala Gln Lys Leu Pro

2615 2620 2625

Lys Ala Ile Met Gly Pro Ser Tyr Gly Phe Gln Tyr Ser Pro Ala

2630 2635 2640

Glu Arg Val Asp Phe Leu Leu Lys Ala Trp Gly Ser Lys Lys Asp

2645 2650 2655

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val

2660 2665 2670

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln Ala Cys

2675 2680 2685

Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu Thr Glu

2690 2695 2700

Arg Leu Tyr Val Gly Gly Pro Met Thr Asn Ser Lys Gly Gln Ser

2705 2710 2715

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser

2720 2725 2730

Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys

2735 2740 2745

Lys Ala Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp

2750 2755 2760

Asp Leu Val Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu

2765 2770 2775
Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala
2780 2785 2790

Pro Pro Gly Asp Leu Pro Arg Pro Glu Tyr Asp Leu Glu Leu Ile
2795 2800 2805

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Asp Ser Arg Gly
2810 2815 2820

Arg Arg Arg Tyr Phe Leu Thr Arg Asp Pro Thr Thr Pro Ile Thr
2825 2830 2835

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp
2840 2845 2850

Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Val Arg Met
2855 2860 2865

Val Ile Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Asp Thr
2870 2875 2880

Leu Asn Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser
2885 2890 2895

Val Asn Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly
2900 2905 2910

Leu Glu Ala Phe Ser Leu His Thr Tyr Ser Pro His Glu Leu Ser
2915 2920 2925

Arg Val Ala Ala Thr Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg
2930 2935 2940

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ala
2945 2950 2955

Gln Gly Ala Arg Ala Ala Ile Cys Gly Arg Tyr Leu Phe Asn Trp
2960 2965 2970 

Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Leu Pro Glu Ala Ser 
2975 2980 2985 

Arg Leu Asp Leu Ser Gly Trp Phe Thr Val Gly Ala Gly Gly Gly
2990 2995 3000

Asp Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Leu Leu Leu

3005 3010 3015

Leu Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Phe Leu Leu

3020 3025 3030 

Pro Ala Arg 
3033
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr

5 10 15

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile

20 25 30

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly

35 40 45

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly

50 55 60

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser

65 70 75

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly

80 85 90

Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro

95 100 105

Thr Trp Gly Pro Thr Asp Pro Arg His Arg Ser Arg Asn Leu Gly

110 115 120
Arg Val Ile Asp Thr Ile Thr Cys Gly Phe Ala Asp Leu Met Gly

125 130 135
Tyr Ile Pro Val Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala

140 145 150
Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala

155 160 165
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala

170 175 180

Leu Leu Ser Cys Val Thr Met Pro Val Ser Ala Val Glu Val Arg

185 190 195

Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp Cys Ser Asn Asn

200 205 210

Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu His Leu Pro Gly

215 220 225

Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu Arg Cys Trp Ile

230 235 240

Gln Val Thr Pro Asp Val Ala Val Lys His Arg Gly Ala Leu Thr

245 250 255

Arg Ser Leu Arg Thr His Val Asp Met Ile Val Met Ala Ala Thr

260 265 270

Ala Cys Ser Ala Leu Tyr Val Gly Asp Val Cys Gly Ala Val Met

275 280 285

Ile Leu Ser Gln Ala Phe Met Val Ser Pro Gln Arg His Asn Phe

290 295 300

Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly His Ile Thr Gly

305 310 315

His Arg Met Ala Trp Asp Met Met Leu Asn Trp Ser Pro Thr Leu

320 325 330

Ala Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu Val Leu

335 340 345

Glu Ile Ile Phe Gly Gly His Trp Gly Val Ala Phe Gly Leu Gly

350 355 360

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Val Ala Ile Leu

365 370 375 

Leu Leu Val Ala Gly Val Asp Ala Ser Thr Tyr Ser Thr Gly Gln
380 385 390

Gln Ala Gly Arg Ala Ala Tyr Gly Ile Ser Ser Leu Phe Asn Thr

395 400 405

Gly Ala Lys Gln Asn Leu His Leu Ile Asn Thr Asn Gly Ser Trp

410 415 420

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Glu Thr

425 430 435

Gly Phe Ile Ala Ser Leu Val Tyr Tyr Arg Arg Phe Asn Ser Ser

440 445 450

Gly Cys Pro Glu Arg Leu Ser Ser Cys Arg Gly Leu Asp Asp Phe

455 460 465

Arg Ile Gly Trp Gly Thr Leu Glu Tyr Glu Thr Asn Val Thr Asn

470 475 480

Asp Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro

485 490 495

Cys Gly Ile Val Pro Ala Arg Thr Val Cys Gly Pro Val Tyr Cys

500 505 510

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Lys Gln Gly

515 520 525

Val Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu

530 535 540

Leu Asn Ser Thr Arg Pro Pro Arg Gly Ala Trp Phe Gly Cys Thr

545 550 555

Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Ala Pro Pro

560 565 570

Cys Arg Ile Arg Lys Asp Tyr Asn Ser Thr Ile Asp Leu Leu Cys

575 580 585 

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Ala Thr Tyr Leu Lys 

590 595 600
Cys Gly Ala Gly Pro Trp Leu Thr Pro Arg Cys Leu Val Asp Tyr

605 610 615

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile

620 625 630

Phe Lys Ala Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Ser

635 640 645

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Arg Leu Glu Asp

650 655 660

Arg Asp Arg Gly Gln Gln Ser Pro Leu Leu His Ser Thr Thr Glu

665 670 675

Trp Ala Val Phe Pro Cys Ser Phe Ser Asp Leu Pro Ala Leu Ser

680 685 690

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Tyr

695 700 705

Leu Tyr Gly Leu Ser Pro Ala Leu Thr Arg Tyr Ile Val Lys Trp

710 715 720 

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val

725 730 735 

Cys Ala Cys Leu Trp Met Leu Asn Ile Leu Gly Gln Ala Glu Ala

740 745 750

Ala Leu Glu Lys Leu Ile Ile Leu His Ser Ala Ser Ala Ala Ser

755 760 765

Ala Asn Gly Pro Leu Trp Phe Phe Ile Phe Phe Thr Ala Ala Trp

770 775 780

Tyr Leu Lys Gly Arg Val Val Pro Val Ala Thr Tyr Ser Val Leu

785 790 795

Gly Leu Trp Ser Phe Leu Leu Leu Val Leu Ala Leu Pro Gln Gln

800 805 810 

Ala Tyr Ala Leu Asp Ala Ala Glu Gln Gly Glu Leu Gly Leu Ala
815 820 825

Ile Leu Val Ile Ile Ser Ile Phe Thr Leu Tip Pro Ala Tyr Lys

830 835 840 

Ile Leu Leu Ser Arg Ser Val Trp Trp Leu Ser Tyr Met Leu Val

845 850 855

Leu Ala Glu Ala Gln Ile Gln Gln Trp Val Pro Pro Leu Glu Val

860 865 870

Arg Gly Gly Arg Asp Gly Ile Ile Trp Val Ala Val Ile Leu His

875 880 885

Pro Arg Leu Val Phe Glu Val Thr Lys Trp Leu Leu Ala Ile Leu

890 895 900

Gly Pro Ala Tyr Leu Leu Arg Ala Ser Leu Leu Arg Ile Pro Tyr

905 910 915

Phe Val Arg Ala His Ala Leu Leu Arg Val Cys Thr Leu Val Lys

920 925 930

His Leu Ala Gly Ala Arg Tyr Ile Gln Met Leu Leu Ile Thr Ile

935 940 945

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Ser Pro Leu

950 955 960

Ser Thr Trp Ala Ala Gln Gly Leu Arg Asp Leu Ala Ile Ala Val

965 970 975

Glu Pro Val Val Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp

980 985 990
Gly Ala Glu Thr Val Ala Cys Gly Asp Ile Leu His Gly Leu Pro

995 1000 1005 
Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp 

1010 1015 1020 
Gly Tyr Thr Ser Lys Gly Trp Asn Leu Leu Ala Pro Ile Thr Ala

1025 1030 1035
Tyr Thr Gln Gln Thr Arg Gly Leu Leu Gly Ala Ile Val Val Ser

1040 1045 1050

Leu Thr Gly Arg Asp Lys Asn Glu Gln Ala Gly Gln Val Gln Val

1055 1060 1065

Leu Ser Ser Val Thr Gln Thr Phe Leu Gly Thr Ser Ile Ser Gly

1070 1075 1080

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala

1085 1090 1095

Gly Pro Lys Gly Pro Val Thr Gln Met Tyr Thr Ser Ala Glu Gly

1100 1105 1110

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Asp

1115 1120 1125

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn

1130 1135 1140

Ala Asp Val Ile Pro Val Arg Arg Lys Asp Asp Arg Arg Gly Ala

1145 1150 1155

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly

1160 1165 1170

Gly Pro Val Leu Cys Ser Arg Gly His Ala Val Gly Leu Phe Arg

1175 1180 1185

Ala Ala Val Cys Ala Arg Gly Val Ala Lys Ser Ile Asp Phe Ile

1190 1195 1200

Pro Val Glu Ser Leu Asp Ile Ala Thr Arg Thr Pro Ser Phe Ser

1205 1210 1215

Asp Asn Ser Ala Pro Pro Ala Val Pro Gln Ser Tyr Gln Val Gly

1220 1225 1230 

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 

1235 1240 1245 

Ala Ala Tyr Ala Ser Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1250 1255 1260

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala

1265 1270 1275

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr

1280 1285 1290

Thr Gly Asp Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Ile Ala

1295 1300 1305

Asp Gly Gly Cys Ala Ala Gly Ala Tyr Asp Ile Ile Ile Cys Asp

1310 1315 1320

Glu Cys His Ser Val Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr

1325 1330 1335

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Val Val Leu

1340 1345 1350

Ala Thr Ala Thr Pro Pro Gly Thr Val Thr Thr Pro His Ser Asn

1355 1360 1365

Ile Glu Glu Val Ala Leu Gly His Glu Gly Glu Ile Pro Phe Tyr

1370 1375 1380

Gly Lys Ala Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu

1385 1390 1395

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala

1400 1405 1410

Leu Arg Gly Thr Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu

1415 1420 1425

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala

1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val

1445 1450 1455

Ile Asp Cys Asn Val Ala Val Ser Gln Ile Val Asp Phe Ser Leu

1460 1465 1470
Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala
1475 1480 1485

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu
1490 1495 1500

Gly Ile Tyr Arg Tyr Val Ser Ser Gly Glu Gly Pro Ser Gly Met
1505 1510 1515

Phe Asp Ser Val Val Pro Cys Glu Cys Tyr Asp Ala Gly Ala Ala
1520 1525 1530

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala
1535 1540 1545

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu
1550 1555 1560

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asn Ala His
1565 1570 1575

Phe Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Ala Tyr Leu
1580 1585 1590

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro
1595 1600 1605

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro
1610 1615 1620

Thr Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val
1625 1630 1635

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala
1640 1645 1650

Thr Cys Met Gln Ala Asp Leu Glu Ile Met Thr Ser Ser Trp Val
1655 1660 1665 

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala 
1670 1675 1680 

Thr Gly Cys Ile Ser Ile Ile Gly Arg Leu His Leu Asn Asp Arg
1685 1690 1695

Val Val Val Thr Pro Asp Lys Glu Ile Leu Tyr Glu Ala Phe Asp

1700 1705 1710

Glu Met Glu Glu Cys Ala Ser Lys Ala Ala Leu Ile Glu Glu Gly

1715 1720 1725

Gln Arg Met Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu

1730 1735 1740

Gln Gln Ala Thr Arg Gln Ala Gln Gly Met Gln Pro Ala Ile Gln

1745 1750 1755

Ser Ser Trp Pro Lys Leu Glu Gln Phe Trp Ala Lys His Met Trp

1760 1765 1770

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu

1775 1780 1785

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala

1790 1795 1800

Leu Thr Ser Pro Leu Pro Thr Ser Thr Thr Ile Leu Leu Asn Ile

1805 1810 1815

Met Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala

1820 1825 1830

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser

1835 1840 1845

Ile Gly Leu Gly Lys Ile Leu Val Asp Val Leu Ala Gly Tyr Gly

1850 1855 1860

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly

1865 1870 1875

Glu Lys Pro Thr Val Glu Asp Val Val Asn Leu Leu Pro Ala Ile

1880 1885 1890

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile
1895 1900 1905
Leu Arg Arg His Val Gly Gln Gly Glu Gly Ala Val Gln Trp Met

1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro

1925 1930 1935

Thr His Tyr Val Val Glu Ser Asp Ala Ser Gln Arg Val Thr Gln

1940 1945 1950

Val Leu Ser Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His

1955 1960 1965

Ala Trp Ile Thr Glu Asp Cys Pro Ile Pro Cys Ser Gly Ser Trp

1970 1975 1980

Leu Gln Asp Ile Trp Asp Trp Val Cys Ser Ile Leu Thr Asp Phe

1985 1990 1995

Lys Asn Trp Leu Ser Ser Lys Leu Leu Pro Lys Met Pro Gly Ile

2000 2005 2010

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly

2015 2020 2025

Thr Gly Val Met Thr Thr Arg Tyr Pro Cys Gly Ala Asn Ile Ser

2030 2035 2040

Gly His Val Arg Met Gly Thr Met Lys Ile Thr Gly Pro Lys Thr

2045 2050 2055

Cys Leu Asn Leu Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr

2060 2065 2070

Glu Gly Pro Cys Val Pro Lys Pro Pro Pro Asn Tyr Lys Thr Ala

2075 2080 2085

Ile Trp Arg Val Ala Ala Ser Glu Tyr Val Glu Val Thr Gln His

2090 2095 2100 

Gly Ser Phe Ser Tyr Val Thr Gly Leu Thr Ser Asp Asn Leu Lys 

2105 2110 2115 

Val Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Ser Trp Val Asp
2120 2125 2130

Gly Val Gln Ile His Arg Phe Ala Pro Val Pro Gly Pro Phe Phe

2135 2140 2145

Arg Asp Glu Val Thr Phe Thr Val Gly Leu Asn Ser Phe Val Val

2150 2155 2160

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Glu Val Leu

2165 2170 2175

Ala Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala

2180 2185 2190

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Gln Ala Ser Ser

2195 2200 2205

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr

2210 2215 2220

Thr His Lys Thr Ala Tyr Asp Cys Asp Met Val Asp Ala Asn Leu

2225 2230 2235

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Asp Ser Lys Val

2240 2245 2250

Ile Val Leu Asp Ser Leu Asp Ser Met Thr Glu Val Glu Asp Asp

2255 2260 2265

Arg Glu Pro Ser Val Pro Ser Glu Tyr Leu Ile Lys Arg Arg Lys

2270 2275 2280

Phe Pro Pro Ala Leu Pro Pro Trp Ala Arg Pro Asp Tyr Asn Pro

2285 2290 2295

Val Leu Ile Glu Thr Trp Lys Arg Pro Gly Tyr Glu Pro Pro Thr

2300 2305 2310

Val Leu Gly Cys Ala Leu Pro Pro Thr Leu Gln Thr Pro Val Pro

2315 2320 2325 

Pro Pro Arg Arg Arg Arg Ala Lys Ile Leu Thr Gln Asp Asp Val

2330 2335 2340
Glu Gly Ile Leu Arg Glu Met Ala Asp Lys Val Leu Ser Pro Leu

2345 2350 2355

Gln Asp Asn Asn Asp Ser Gly His Ser Thr Gly Ala Asp Thr Gly

2360 2365 2370

Gly Asp Ile Val Gln Gln Pro Ser Asp Glu Thr Ala Ala Ser Glu

2375 2380 2385

Ala Gly Ser Leu Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly

2390 2395 2400

Asp Pro Asp Leu Glu Phe Glu Pro Val Gly Ser Ala Pro Pro Ser

2405 2410 2415

Glu Gly Glu Cys Glu Val Ile Asp Ser Asp Ser Lys Ser Trp Ser

2420 2425 2430

Thr Val Ser Asp Gln Glu Asp Ser Val Ile Cys Cys Ser Met Ser

2435 2440 2445

Tyr Ser Trp Thr Gly Ala Leu Ile Thr Pro Cys Gly Pro Glu Glu

2450 2455 2460

Glu Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Met Arg Phe

2465 2470 2475

His Asn Lys Val Tyr Ser Thr Thr Ser Arg Ser Ala Ser Leu Arg

2480 2485 2490

Ala Lys Lys Val Thr Phe Asp Arg Val Gln Val Leu Asp Ala His

2495 2500 2505

Tyr Asp Ser Val Leu Gln Asp Val Lys Arg Ala Ala Ser Lys Val

2510 2515 2520 

Gly Ala Arg Leu Leu Thr Val Glu Glu Ala Cys Ala Leu Thr Pro 

2525 2530 2535 

Pro His Ser Ala Lys Ser Arg Tyr Gly Phe Gly Ala Lys Glu Val 

2540 2545 2550 

Arg Ser Leu Ser Arg Arg Ala Val Asn His Ile Arg Ser Val Trp
2555 2560 2565

Glu Asn Leu Leu Glu Asp Gln Arg Thr Pro Ile Asp Thr Thr Ile

2570 2575 2580

Met Ala Lys Asn Glu Val Phe Cys Ile Asp Pro Thr Lys Gly Gly

2585 2590 2595

Lys Lys Pro Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg

2600 2605 2610

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Thr Gln Lys Leu Pro

2615 2620 2625

Lys Ala Ile Met Gly Pro Ser Tyr Gly Phe Gln Tyr Ser Pro Ala

2630 2635 2640

Glu Arg Val Asp Phe Leu Leu Lys Ala Trp Gly Ser Lys Lys Asp

2645 2650 2655

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val

2660 2665 2670

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln Ala Cys

2675 2680 2685

Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu Thr Glu

2690 2695 2700

Arg Leu Tyr Val Gly Gly Pro Met Thr Asn Ser Lys Gly Gln Ser

2705 2710 2715

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser

2720 2725 2730

Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys

2735 2740 2745

Lys Ala Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp

2750 2755 2760 

Asp Leu Val Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu

2765 2770 2775
Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala

2780 2785 2790

Pro Pro Gly Asp Leu Pro Arg Pro Glu Tyr Asp Leu Glu Leu Ile

2795 2800 2805

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Asp Ser Arg Gly

2810 2815 2820

Arg Arg Arg Tyr Phe Leu Thr Arg Asp Pro Thr Thr Pro Ile Thr

2825 2830 2835

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp

2840 2845 2850

Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Val Arg Met

2855 2860 2865

Val Ile Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Asp Thr

2870 2875 2880

Leu Asn Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser

2885 2890 2895

Val Asn Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly

2900 2905 2910

Leu Glu Ala Phe Ser Leu His Thr Tyr Ser Pro His Glu Leu Ser

2915 2920 2925

Arg Val Ala Ala Thr Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg

2930 2935 2940

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ala

2945 2950 2955

Gln Gly Ala Arg Ala Ala Ile Cys Gly Arg Tyr Leu Phe Asn Trp

2960 2965 2970 

Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Leu Pro Glu Ala Ser 

2975 2980 2985 

Arg Leu Asp Leu Ser Gly Trp Phe Thr Val Gly Ala Gly Gly Gly